INTRODUCTION



This application claims the benefit of the filing date of U.S. Application Serial Number
09/549,848, filed April 14, 2000.
TECHNICAL FIELD



The present invention is directed to nucleic acid and amino acid sequences and 
constructs, and methods related thereto. 
BACKGROUND



Isoprenoids are ubiquitous compounds found in all living organisms. Plants synthesize a
diverse array of greater than 22,000 isoprenoids (Connolly and Hill (1992) Dictionary of
Terpenoids, Chapman and Hall, New York, NY). In plants, isoprenoids play essential roles in
particular cell functions, such as the production of sterols, which contribute to eukaryotic
membrane architecture; acyclic polyprenoids found in the side chain of ubiquinone and
plastoquinone; growth regulators like abscisic acid, gibberellins, and brassinosteroids; and the
photosynthetic pigments chlorophylls and carotenoids. Although the physiological role of other
plant isoprenoids, like that of the vast array of secondary metabolites, is less evident, some are
known to play key roles in mediating adaptive responses to different environmental challenges.
In spite of the remarkable diversity of structure and function, all isoprenoids originate from a
single metabolic precursor, isopentenyl diphosphate (IPP) (Wright, (1961) Annu. Rev. Biochem.
20:525-548; and Spurgeon and Porter, (1981) in Biosynthesis of Isoprenoid Compounds, Porter
and Spurgeon eds (John Wiley, New York) Vol. 1, pp. 1-46).

A number of unique and interconnected biochemical pathways derived from the
isoprenoid pathway leading to secondary metabolites, including tocopherols, exist in chloroplasts
of higher plants. Tocopherols not only perform vital functions in plants, but are also important
from mammalian nutritional perspectives. In plastids, tocopherols account for up to 40% of the
total quinone pool.

Tocopherols and tocotrienols (unsaturated tocopherol derivatives) are well known
antioxidants, and play an important role in protecting cells from free radical damage, and in the
prevention of many diseases, including cardiac disease, cancer, cataracts, retinopathy,
Alzheimer's disease, and neurodegeneration, and have been shown to have beneficial effects on
symptoms of arthritis, and in anti-aging. Vitamin E is used in chicken feed for improving the
shelf life, appearance, flavor, and oxidative stability of meat, and to transfer tocols from feed to
eggs. Vitamin E has been shown to be essential for normal reproduction, to improve overall
performance, and to enhance immunocompetence in livestock animals. Vitamin E supplement in
animal feed also imparts oxidative stability to milk products.

The demand for natural tocopherols as supplements has been steadily growing at a rate of
10-20% for the past three years. At present, the demand exceeds the supply for natural
tocopherols, which are known to be more biopotent than racemic mixtures of synthetically
produced tocopherols. Naturally occurring tocopherols are all d-stereomers, whereas synthetic
α-tocopherol is a mixture of eight d,l-α-tocopherol isomers, only one of which (12.5%) is identical
to the natural d-α-tocopherol. Natural d-α-tocopherol has the highest vitamin E activity (1.49
IU/mg) when compared to other natural tocopherols or tocotrienols. The synthetic α-tocopherol
has a vitamin E activity of 1.1 IU/mg. In 1995, the worldwide market for raw refined
tocopherols was $1020 million; synthetic materials comprised 85-88% of the market, the
remaining 12-15% being natural materials. The best sources of natural tocopherols and
tocotrienols are vegetable oils and grain products. Currently, most of the natural Vitamin E is
produced from γ-tocopherol derived from soy oil processing, which is subsequently converted to
α-tocopherol by chemical modification (α-tocopherol exhibits the greatest biological activity).
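
The figures quoted above imply two simple ratios: the share of the natural isomer in the synthetic mixture, and the relative potency of natural versus synthetic α-tocopherol. The following is a minimal sketch of that arithmetic in Python. It assumes the eight isomers are present in equal amounts (which is what the 12.5% figure implies); the function and constant names are illustrative only.

```python
# Values taken from the text above; names are illustrative, not from the source.
ISOMERS_IN_SYNTHETIC_MIX = 8
NATURAL_IU_PER_MG = 1.49    # natural d-alpha-tocopherol
SYNTHETIC_IU_PER_MG = 1.1   # synthetic alpha-tocopherol


def natural_isomer_fraction(n_isomers: int = ISOMERS_IN_SYNTHETIC_MIX) -> float:
    """Fraction of the natural isomer, assuming an equimolar isomer mixture."""
    return 1.0 / n_isomers


def relative_potency(natural: float = NATURAL_IU_PER_MG,
                     synthetic: float = SYNTHETIC_IU_PER_MG) -> float:
    """Vitamin E activity of the natural form relative to the synthetic form."""
    return natural / synthetic


def synthetic_mg_for_same_activity(mg_natural: float) -> float:
    """Mass of synthetic material giving the same IU as mg_natural of natural material."""
    return mg_natural * relative_potency()


if __name__ == "__main__":
    print(f"natural isomer share: {natural_isomer_fraction():.1%}")
    print(f"relative potency (natural/synthetic): {relative_potency():.2f}")
```

On these numbers, roughly 1.35 mg of the synthetic mixture would be needed to match the vitamin E activity of 1 mg of natural d-α-tocopherol.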

Methods of enhancing the levels of tocopherols and tocotrienols in plants, especially levels
of the more desirable compounds that can be used directly, without chemical modification, would be
useful to the art as such molecules exhibit better functionality and bioavailability.

In addition, methods for the increased production of other isoprenoid derived compounds in
a host plant cell are desirable. Furthermore, methods for the production of particular isoprenoid
compounds in a host plant cell are also needed.


SUMMARY OF THE INVENTION 

The present invention is directed to sequences to proteins involved in tocopherol
synthesis. The polynucleotides and polypeptides of the present invention include those derived
from prokaryotic and eukaryotic sources.

Thus, one aspect of the present invention relates to prenyltransferase, and in particular to 
isolated polynucleotide sequences encoding prenyltransferase proteins and polypeptides related 
thereto. In particular, isolated nucleic acid sequences encoding prenyltransferase proteins from 
bacterial and plant sources are provided. 

In another aspect, the present invention provides isolated polynucleotide sequences
encoding tocopherol cyclase, and polypeptides related thereto. In particular, isolated nucleic acid
sequences encoding tocopherol cyclase proteins from bacterial and plant sources are provided.

Another aspect of the present invention relates to oligonucleotides which include partial 
or complete prenyltransferase or tocopherol cyclase encoding sequences. 

It is also an aspect of the present invention to provide recombinant DNA constructs which
can be used for transcription or transcription and translation (expression) of prenyltransferase or
tocopherol cyclase. In particular, constructs are provided which are capable of transcription or
transcription and translation in host cells.

In another aspect of the present invention, methods are provided for production of
prenyltransferase or tocopherol cyclase in a host cell or progeny thereof. In particular, host cells
are transformed or transfected with a DNA construct which can be used for transcription or
transcription and translation of prenyltransferase or tocopherol cyclase. The recombinant cells
which contain prenyltransferase or tocopherol cyclase are also part of the present invention.

In a further aspect, the present invention relates to methods of using polynucleotide and
polypeptide sequences to modify the tocopherol content of host cells, particularly in host plant
cells. Plant cells having such a modified tocopherol content are also contemplated herein.
Methods and cells in which both prenyltransferase and tocopherol cyclase are expressed in a host
cell are also part of the present invention.

The modified plants, seeds and oils obtained by the expression of the prenyltransferase or
tocopherol cyclase are also considered part of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 provides an amino acid sequence alignment between ATPT2, ATPT3, ATPT4,
ATPT8, and ATPT12, performed using ClustalW.

Figure 2 provides a schematic picture of the expression construct pCGN10800.
Figure 3 provides a schematic picture of the expression construct pCGN10801.
Figure 4 provides a schematic picture of the expression construct pCGN10803.
Figure 5 provides a schematic picture of the construct pCGN10806.
Figure 6 provides a schematic picture of the construct pCGN10807.
Figure 7 provides a schematic picture of the construct pCGN10808.
Figure 8 provides a schematic picture of the expression construct pCGN10809.
Figure 9 provides a schematic picture of the expression construct pCGN10810.
Figure 10 provides a schematic picture of the expression construct pCGN10811.
Figure 11 provides a schematic picture of the expression construct pCGN10812.
Figure 12 provides a schematic picture of the expression construct pCGN10813.
Figure 13 provides a schematic picture of the expression construct pCGN10814.
Figure 14 provides a schematic picture of the expression construct pCGN10815.
Figure 15 provides a schematic picture of the expression construct pCGN10816.
Figure 16 provides a schematic picture of the expression construct pCGN10817.
Figure 17 provides a schematic picture of the expression construct pCGN10819.
Figure 18 provides a schematic picture of the expression construct pCGN10824.
Figure 19 provides a schematic picture of the expression construct pCGN10825.
Figure 20 provides a schematic picture of the expression construct pCGN10826.
Figure 21 provides an amino acid sequence alignment using ClustalW between the
Synechocystis prenyltransferase sequences.

Figure 22 provides an amino acid sequence of the ATPT2, ATPT3, ATPT4, ATPT8, and
ATPT12 protein sequences from Arabidopsis and the slr1736, slr0926, sll1899, slr0056, and
slr1518 amino acid sequences from Synechocystis.

Figure 23 provides the results of the enzymatic assay from preparations of wild type
Synechocystis strain 6803, and Synechocystis slr1736 knockout.

Figure 24 provides bar graphs of HPLC data obtained from seed extracts of transgenic
Arabidopsis containing pCGN10822, which provides for the expression of the ATPT2 sequence,
in the sense orientation, from the napin promoter. Provided are graphs for alpha, gamma, and
delta tocopherols, as well as total tocopherol for 22 transformed lines, as well as a
nontransformed (wildtype) control.

Figure 25 provides a bar graph of HPLC analysis of seed extracts from Arabidopsis plants
transformed with pCGN10803 (35S-ATPT2, in the antisense orientation), pCGN10822 (line
1625, napin ATPT2 in the sense orientation), pCGN10809 (line 1627, 35S-ATPT3 in the sense
orientation), a nontransformed (wt) control, and an empty vector transformed control.

Figure 26 shows total tocopherol levels measured in T# Arabidopsis seed of line.

Figure 27 shows total tocopherol levels measured in T# Arabidopsis seed of line.

Figure 28 shows total tocopherol levels measured in developing canola seed of line
10822-1.

Figure 29 shows results of phytyl prenyltransferase activity assay using Synechocystis
wild type and slr1737 knockout mutant membrane preparations.

Figure 30 is the chromatograph from an HPLC analysis of Synechocystis extracts.

Figure 31 is a sequence alignment of the Arabidopsis homologue with the sequence of the
public database.

Figure 32 shows the results of hydropathic analysis of slr1737.

Figure 33 shows the results of hydropathic analysis of the Arabidopsis homologue of
slr1737.

Figure 34 shows the catalytic mechanism of various cyclase enzymes.

Figure 35 is a sequence alignment of slr1737, the slr1737 Arabidopsis homologue and the
Arabidopsis chalcone isomerase.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides, inter alia, compositions and methods for altering (for 
example, increasing and decreasing) the tocopherol levels and/or modulating their ratios in host 
cells. In particular, the present invention provides polynucleotides, polypeptides, and methods of 
use thereof for the modulation of tocopherol content in host plant cells. 

The biosynthesis of α-tocopherol in higher plants involves condensation of homogentisic
acid and phytylpyrophosphate to form 2-methyl-6-phytylbenzoquinol that can, by cyclization and
subsequent methylations (Fiedler et al., 1982, Planta, 155:511-515, Soll et al., 1980, Arch.
Biochem. Biophys. 204: 544-550, Marshall et al., 1985 Phytochem., 24: 1705-1711, all of which
are herein incorporated by reference in their entirety), form various tocopherols.

The Arabidopsis pds2 mutant, identified and characterized by Norris et al. (1995), is
deficient in tocopherol and plastoquinone-9 accumulation. Further genetic and biochemical
analysis suggested that the protein encoded by PDS2 may be responsible for the prenylation of
homogentisic acid. The PDS2 locus identified by Norris et al. (1995) has been hypothesized to
encode the tocopherol phytyl-prenyltransferase, as the pds2 mutant fails to accumulate
tocopherols.

Norris et al. (1995) determined that in Arabidopsis pds2 lies at the top of chromosome 3, 
approximately 7 centimorgans above long hypocotyl2, based on the genetic map. ATPT2 is 
located on chromosome 2 between 36 and 41 centimorgans, lying on BAC F19F24, indicating 
that ATPT2 does not correspond to PDS2. Thus, it is an aspect of the present invention to 
provide novel polynucleotides and polypeptides involved in the prenylation of homogentisic acid. 
This reaction may be a rate limiting step in tocopherol biosynthesis, and this gene has yet to be 
isolated. 

U.S. Patent No. 5,432,069 describes the partial purification and characterization of
tocopherol cyclase from Chlorella protothecoides, Dunaliella salina and wheat. The cyclase
is described as being glycine rich, water soluble and with a predicted MW of 48-50 kDa. However,
only limited peptide fragment sequences were available.

In one aspect, the present invention provides polynucleotide and polypeptide sequences
involved in the prenylation of straight chain and aromatic compounds. Straight chain
prenyltransferases, as used herein, comprise sequences which encode proteins involved in the
prenylation of straight chain compounds, including, but not limited to, geranyl geranyl
pyrophosphate and farnesyl pyrophosphate. Aromatic prenyltransferases, as used herein,
comprise sequences which encode proteins involved in the prenylation of aromatic compounds,
including, but not limited to, menaquinone, ubiquinone, chlorophyll, and homogentisic acid. The
prenyltransferase of the present invention preferably prenylates homogentisic acid.

In another aspect, the invention provides polynucleotide and polypeptide sequences to
tocopherol cyclization enzymes. The 2,3-dimethyl-5-phytylplastoquinol cyclase ("tocopherol
cyclase") is responsible for the cyclization of 2,3-dimethyl-5-phytylplastoquinol to tocopherol.

Isolated Polynucleotides, Proteins, and Polypeptides

A first aspect of the present invention relates to isolated prenyltransferase
polynucleotides. Another aspect of the present invention relates to isolated tocopherol cyclase
polynucleotides. The polynucleotide sequences of the present invention include isolated
polynucleotides that encode the polypeptides of the invention having a deduced amino acid
sequence selected from the group of sequences set forth in the Sequence Listing and to other
polynucleotide sequences closely related to such sequences and variants thereof.

The invention provides a polynucleotide sequence identical over its entire length to each
coding sequence as set forth in the Sequence Listing. The invention also provides the coding
sequence for the mature polypeptide or a fragment thereof, as well as the coding sequence for the
mature polypeptide or a fragment thereof in a reading frame with other coding sequences, such as
those encoding a leader or secretory sequence, a pre-, pro-, or prepro- protein sequence. The
polynucleotide can also include non-coding sequences, including for example, but not limited to,
non-coding 5' and 3' sequences, such as the transcribed, untranslated sequences, termination
signals, ribosome binding sites, sequences that stabilize mRNA, introns, polyadenylation signals,
and additional coding sequence that encodes additional amino acids. For example, a marker
sequence can be included to facilitate the purification of the fused polypeptide. Polynucleotides
of the present invention also include polynucleotides comprising a structural gene and the
naturally associated sequences that control gene expression.

The invention also includes polynucleotides of the formula:

X-(R1)n-(R2)-(R3)n-Y

wherein, at the 5' end, X is hydrogen, and at the 3' end, Y is hydrogen or a metal, R1 and R3 are
any nucleic acid residue, n is an integer between 1 and 3000, preferably between 1 and 1000 and
R2 is a nucleic acid sequence of the invention, particularly a nucleic acid sequence selected from
the group set forth in the Sequence Listing and preferably those of SEQ ID NOs: 1, 3, 5, 7, 8, 10,
11, 13-16, 18, 23, 29, 36, and 38. In the formula, R2 is oriented so that its 5' end residue is at the
left, bound to R1, and its 3' end residue is at the right, bound to R3. Any stretch of nucleic acid
residues denoted by either R group, where n is greater than 1, may be either a heteropolymer or a
homopolymer, preferably a heteropolymer.

The invention also relates to variants of the polynucleotides described herein that encode
for variants of the polypeptides of the invention. Variants that are fragments of the
polynucleotides of the invention can be used to synthesize full-length polynucleotides of the
invention. Preferred embodiments are polynucleotides encoding polypeptide variants wherein 5
to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid residues of a polypeptide sequence of the invention are
substituted, added or deleted, in any combination. Particularly preferred are substitutions,
additions, and deletions that are silent such that they do not alter the properties or activities of the
polynucleotide or polypeptide.

Further preferred embodiments of the invention are polynucleotides that are at least 50%, 60%,
or 70% identical over their entire length to a polynucleotide encoding a polypeptide of the
invention, and polynucleotides that are complementary to such polynucleotides. More preferable are
polynucleotides that comprise a region that is at least 80% identical over its entire length to a
polynucleotide encoding a polypeptide of the invention and polynucleotides that are
complementary thereto. In this regard, polynucleotides at least 90% identical over their entire
length are particularly preferred, and those at least 95% identical are especially preferred. Further,
those with at least 97% identity are highly preferred and those with at least 98% and 99% identity
are particularly highly preferred, with those at least 99% being the most highly preferred.

Preferred embodiments are polynucleotides that encode polypeptides that retain 
substantially the same biological function or activity as the mature polypeptides encoded by the 
polynucleotides set forth in the Sequence Listing. 

The invention further relates to polynucleotides that hybridize to the above-described
sequences. In particular, the invention relates to polynucleotides that hybridize under stringent
conditions to the above-described polynucleotides. As used herein, the terms "stringent
conditions" and "stringent hybridization conditions" mean that hybridization will generally occur
if there is at least 95% and preferably at least 97% identity between the sequences. An example
of stringent hybridization conditions is overnight incubation at 42°C in a solution comprising
50% formamide, 5x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate
(pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 micrograms/milliliter denatured,
sheared salmon sperm DNA, followed by washing the hybridization support in 0.1x SSC at
approximately 65°C. Other hybridization and wash conditions are well known and are
exemplified in Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold
Spring Harbor, NY (1989), particularly Chapter 11.

The invention also provides a polynucleotide consisting essentially of a polynucleotide 
sequence obtainable by screening an appropriate library containing the complete gene for a 
polynucleotide sequence set forth in the Sequence Listing under stringent hybridization conditions
with a probe having the sequence of said polynucleotide sequence or a fragment thereof; and 
isolating said polynucleotide sequence. Fragments useful for obtaining such a polynucleotide 
include, for example, probes and primers as described herein. 

As discussed herein regarding polynucleotide assays of the invention, for example, 
polynucleotides of the invention can be used as a hybridization probe for RNA, cDNA, or 
genomic DNA to isolate full length cDNAs or genomic clones encoding a polypeptide and to 
isolate cDNA or genomic clones of other genes that have a high sequence similarity to a 
polynucleotide set forth in the Sequence Listing. Such probes will generally comprise at least 15 
bases. Preferably such probes will have at least 30 bases and can have at least 50 bases. 
Particularly preferred probes will have between 30 bases and 50 bases, inclusive. 

The coding region of each gene that comprises or is comprised by a polynucleotide
sequence set forth in the Sequence Listing may be isolated by screening using a DNA sequence
provided in the Sequence Listing to synthesize an oligonucleotide probe. A labeled
oligonucleotide having a sequence complementary to that of a gene of the invention is then used
to screen a library of cDNA, genomic DNA or mRNA to identify members of the library which
hybridize to the probe. For example, synthetic oligonucleotides are prepared which correspond
to the prenyltransferase or tocopherol cyclase EST sequences. The oligonucleotides are used as
primers in polymerase chain reaction (PCR) techniques to obtain 5' and 3' terminal sequence of
prenyltransferase or tocopherol cyclase genes. Alternatively, where oligonucleotides of low
degeneracy can be prepared from particular prenyltransferase or tocopherol cyclase peptides,
such probes may be used directly to screen gene libraries for prenyltransferase or tocopherol
cyclase gene sequences. In particular, screening of cDNA libraries in phage vectors is useful in
such methods due to lower levels of background hybridization.

Typically, a prenyltransferase or tocopherol cyclase sequence obtainable from the use of
nucleic acid probes will show 60-70% sequence identity between the target prenyltransferase or
tocopherol cyclase sequence and the encoding sequence used as a probe. However, lengthy
sequences with as little as 50-60% sequence identity may also be obtained. The nucleic acid
probes may be a lengthy fragment of the nucleic acid sequence, or may also be a shorter,
oligonucleotide probe. When longer nucleic acid fragments are employed as probes (greater than
about 100 bp), one may screen at lower stringencies in order to obtain sequences from the target
sample which have 20-50% deviation (i.e., 50-80% sequence homology) from the sequences
used as probe. Oligonucleotide probes can be considerably shorter than the entire nucleic acid
sequence encoding a prenyltransferase or tocopherol cyclase enzyme, but should be at least
about 10, preferably at least about 15, and more preferably at least about 20 nucleotides. A
higher degree of sequence identity is desired when shorter regions are used as opposed to longer
regions. It may thus be desirable to identify regions of highly conserved amino acid sequence to
design oligonucleotide probes for detecting and recovering other related prenyltransferase or
tocopherol cyclase genes. Shorter probes are often particularly useful for polymerase chain
reactions (PCR), especially when highly conserved sequences can be identified. (See, Gould, et
al., PNAS USA (1989) 86:1934-1938.)
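
The idea of locating a highly conserved region from which to design a short probe can be illustrated with a simple sliding-window scan. The following is a minimal sketch in Python, not a method taken from this specification: it assumes two sequences that have already been aligned to equal length (gaps written as "-"), and the function name and the default window of 20 positions (matching the "at least about 20 nucleotides" preference above) are illustrative choices.

```python
from typing import Optional, Tuple


def most_conserved_window(aligned_a: str, aligned_b: str,
                          window: int = 20, gap: str = "-") -> Optional[Tuple[int, int]]:
    """Return (start, matches) for the gap-free window with the most identical positions.

    Both inputs must be pre-aligned and of equal length. Windows containing a
    gap in either sequence are skipped. Returns None if no window qualifies.
    Ties are resolved in favor of the leftmost window.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    best: Optional[Tuple[int, int]] = None
    for start in range(len(aligned_a) - window + 1):
        wa = aligned_a[start:start + window]
        wb = aligned_b[start:start + window]
        if gap in wa or gap in wb:
            continue  # a probe should not span an insertion or deletion
        matches = sum(x == y for x, y in zip(wa, wb))
        if best is None or matches > best[1]:
            best = (start, matches)
    return best


if __name__ == "__main__":
    a = "AAAACCCCGGGG"
    b = "AAATCCCCGGTG"
    print(most_conserved_window(a, b, window=4))
```

In practice the scan would be run over a multiple alignment of several homologues rather than a single pair; the pairwise form is used here only to keep the sketch short.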

Another aspect of the present invention relates to prenyltransferase or tocopherol cyclase
polypeptides. Such polypeptides include isolated polypeptides set forth in the Sequence Listing,
as well as polypeptides and fragments thereof, particularly those polypeptides which exhibit
prenyltransferase or tocopherol cyclase activity and also those polypeptides which have at least
50%, 60% or 70% identity, preferably at least 80% identity, more preferably at least 90%
identity, and most preferably at least 95% identity to a polypeptide sequence selected from the
group of sequences set forth in the Sequence Listing, and also include portions of such
polypeptides, wherein such portion of the polypeptide preferably includes at least 30 amino acids
and more preferably includes at least 50 amino acids.

"Identity", as is well understood in the art, is a relationship between two or more
polypeptide sequences or two or more polynucleotide sequences, as determined by comparing the
sequences. In the art, "identity" also means the degree of sequence relatedness between
polypeptide or polynucleotide sequences, as determined by the match between strings of such
sequences. "Identity" can be readily calculated by known methods including, but not limited to,
those described in Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press,
New York (1988); Biocomputing: Informatics and Genome Projects, Smith, D.W., ed.,
Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M.
and Griffin, H.G., eds., Humana Press, New Jersey (1994); Sequence Analysis in Molecular
Biology, von Heijne, G., Academic Press (1987); Sequence Analysis Primer, Gribskov, M. and
Devereux, J., eds., Stockton Press, New York (1991); and Carrillo, H., and Lipman, D., SIAM J
Applied Math, 48:1073 (1988). Methods to determine identity are designed to give the largest
match between the sequences tested. Moreover, methods to determine identity are codified in
publicly available programs. Computer programs which can be used to determine identity
between two sequences include, but are not limited to, GCG (Devereux, J., et al., Nucleic Acids
Research 12(1):387 (1984)); and the suite of five BLAST programs, three designed for nucleotide
sequence queries (BLASTN, BLASTX, and TBLASTX) and two designed for protein sequence
queries (BLASTP and TBLASTN) (Coulson, Trends in Biotechnology, 12: 76-80 (1994); Birren,
et al., Genome Analysis, 1: 543-559 (1997)). The BLASTX program is publicly available from
NCBI and other sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH, Bethesda, MD
20894; Altschul, S., et al., J. Mol. Biol., 215:403-410 (1990)). The well known Smith Waterman
algorithm can also be used to determine identity.
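
As an illustration of how a percent identity figure is obtained once two sequences have been aligned, the following is a minimal sketch in Python. It is not the calculation performed by GCG or BLAST. The function name is hypothetical, and the choice of denominator (every alignment column in which at least one sequence has a residue) is an assumption; programs differ on this convention, which is one reason reported identities can differ between tools.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Inputs must be pre-aligned and of equal length, with gaps written as `gap`.
    Columns that are gaps in both sequences are ignored; a column with a gap
    in only one sequence counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(round(percent_identity("ACGT-A", "ACCTTA"), 1))
```

A threshold such as "at least 95% identity" can then be checked directly against the returned value.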

Parameters for polypeptide sequence comparison typically include the following:

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)

Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad. Sci.
USA 89:10915-10919 (1992)

Gap Penalty: 12

Gap Length Penalty: 4

A program which can be used with these parameters is publicly available as the "gap"
program from Genetics Computer Group, Madison Wisconsin. The above parameters along with
no penalty for end gap are the default parameters for peptide comparisons.

Parameters for polynucleotide sequence comparison include the following: 

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970) 

Comparison matrix: matches = +10; mismatches = 0 
Gap Penalty: 50

Gap Length Penalty: 3 

A program which can be used with these parameters is publicly available as the "gap" 
program from Genetics Computer Group, Madison Wisconsin. The above parameters are the 
default parameters for nucleic acid comparisons. 
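
To show how the polynucleotide parameters listed above enter a Needleman and Wunsch style comparison, the following is a minimal sketch in Python of a global alignment score with separate gap and gap length penalties (the recurrence usually attributed to Gotoh). It is not the GCG "gap" program. Two assumptions are made and should be checked against the program actually used: a gap of k positions is charged the gap penalty plus k times the gap length penalty, and gaps at the ends of the sequences are penalized like any other gap.

```python
NEG_INF = float("-inf")


def global_alignment_score(a: str, b: str, match: int = 10, mismatch: int = 0,
                           gap_open: int = 50, gap_extend: int = 3) -> float:
    """Best global alignment score of a and b under the stated scoring scheme.

    Assumed gap cost: a gap of length k costs gap_open + k * gap_extend.
    End gaps are penalized in this sketch.
    """
    n, m = len(a), len(b)
    # M: a[i-1] paired with b[j-1]; X: a[i-1] paired with a gap; Y: b[j-1] paired with a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + i * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + j * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Open a new gap (from M or the other gap state) or extend the current one.
            X[i][j] = max(M[i - 1][j] - gap_open - gap_extend,
                          Y[i - 1][j] - gap_open - gap_extend,
                          X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open - gap_extend,
                          X[i][j - 1] - gap_open - gap_extend,
                          Y[i][j - 1] - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "ACGT"))
    print(global_alignment_score("ACGT", "AGT"))
```

With these defaults a single one-base gap costs 53, so it takes more than five additional matches (at +10 each) to justify opening a gap; this is why the nucleic acid parameters strongly favor ungapped alignments.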

The invention also includes polypeptides of the formula:

X-(R1)n-(R2)-(R3)n-Y

wherein, at the amino terminus, X is hydrogen, and at the carboxyl terminus, Y is hydrogen or a
metal, R1 and R3 are any amino acid residue, n is an integer between 1 and 1000, and R2 is an
amino acid sequence of the invention, particularly an amino acid sequence selected from the
group set forth in the Sequence Listing and preferably those encoded by the sequences provided
in SEQ ID NOs: 2, 4, 6, 9, 12, 17, 19-22, 24-28, 30, 32-35, 37, and 39. In the formula, R2 is
oriented so that its amino terminal residue is at the left, bound to R1, and its carboxy terminal
residue is at the right, bound to R3. Any stretch of amino acid residues denoted by either R
group, where n is greater than 1, may be either a heteropolymer or a homopolymer, preferably a
heteropolymer.






WO 02/33060 



PCT/US01/42673 



Polypeptides of the present invention include isolated polypeptides encoded by a
polynucleotide comprising a sequence selected from the group of a sequence contained in the
Sequence Listing set forth herein.

The polypeptides of the present invention can be mature protein or can be part of a fusion
protein.

Fragments and variants of the polypeptides are also considered to be a part of the 
invention. A fragment is a variant polypeptide which has an amino acid sequence that is entirely 
the same as part but not all of the amino acid sequence of the previously described polypeptides. 
The fragments can be "free-standing" or comprised within a larger polypeptide of which the 
fragment forms a part or a region, most preferably as a single continuous region. Preferred 
fragments are biologically active fragments which are those fragments that mediate activities of 
the polypeptides of the invention, including those with similar activity or improved activity or 
with a decreased activity. Also included are those fragments that are antigenic or immunogenic in an
animal, particularly a human.

Variants of the polypeptide also include polypeptides that vary from the sequences set
forth in the Sequence Listing by conservative amino acid substitutions, that is, substitution of a
residue by another with like characteristics. In general, such substitutions are among Ala, Val,
Leu and Ile; between Ser and Thr; between Asp and Glu; between Asn and Gln; between Lys and
Arg; or between Phe and Tyr. Particularly preferred are variants in which 5 to 10; 1 to 5; 1 to 3
or one amino acid(s) are substituted, deleted, or added, in any combination.
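
The substitution groups listed above can be expressed as a small lookup, which makes it easy to check whether a given variant differs from a reference only by conservative substitutions. The following is a minimal sketch in Python using one-letter amino acid codes. The function names are illustrative, and the sketch handles substitutions only (equal-length sequences), not additions or deletions.

```python
from typing import List, Tuple

# Groups as listed in the text: Ala/Val/Leu/Ile; Ser/Thr; Asp/Glu; Asn/Gln; Lys/Arg; Phe/Tyr.
CONSERVATIVE_GROUPS = [
    {"A", "V", "L", "I"},
    {"S", "T"},
    {"D", "E"},
    {"N", "Q"},
    {"K", "R"},
    {"F", "Y"},
]


def is_conservative(ref: str, var: str) -> bool:
    """True if ref -> var is a substitution within one of the listed groups."""
    if ref == var:
        return False  # not a substitution at all
    return any(ref in group and var in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference: str, variant: str) -> List[Tuple[int, str, str, bool]]:
    """List (1-based position, ref residue, variant residue, conservative?) for each difference."""
    if len(reference) != len(variant):
        raise ValueError("this sketch compares equal-length sequences only")
    return [(pos, r, v, is_conservative(r, v))
            for pos, (r, v) in enumerate(zip(reference, variant), start=1)
            if r != v]


if __name__ == "__main__":
    for change in classify_substitutions("MALKS", "MVLKD"):
        print(change)
```

Counting the entries returned gives the number of substituted residues, which can be compared against the preferred ranges stated above.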

Variants that are fragments of the polypeptides of the invention can be used to produce 
the corresponding full length polypeptide by peptide synthesis. Therefore, these variants can be 
used as intermediates for producing the full-length polypeptides of the invention. 

The polynucleotides and polypeptides of the invention can be used, for example, in the 
transformation of host cells, such as plant host cells, as further discussed herein. 

The invention also provides polynucleotides that encode a polypeptide that is a mature 
protein plus additional amino or carboxyl-terminal amino acids, or amino acids within the mature 
polypeptide (for example, when the mature form of the protein has more than one polypeptide 
chain). Such sequences can, for example, play a role in the processing of a protein from a 
precursor to a mature form, allow protein transport, shorten or lengthen protein half-life, or 
facilitate manipulation of the protein in assays or production. It is contemplated that cellular
enzymes can be used to remove any additional amino acids from the mature protein.

A precursor protein, having the mature form of the polypeptide fused to one or more
prosequences, may be an inactive form of the polypeptide. The inactive precursors generally are
activated when the prosequences are removed. Some or all of the prosequences may be removed
prior to activation. Such precursor proteins are generally called proproteins.

Plant Constructs and Methods of Use 

Of particular interest is the use of the nucleotide sequences in recombinant DNA
constructs to direct the transcription or transcription and translation (expression) of the
prenyltransferase or tocopherol cyclase sequences of the present invention in a host plant cell.
The expression constructs generally comprise a promoter functional in a host plant cell operably
linked to a nucleic acid sequence encoding a prenyltransferase or tocopherol cyclase of the
present invention and a transcriptional termination region functional in a host plant cell.

A first nucleic acid sequence is "operably linked" or "operably associated" with a second 
nucleic acid sequence when the sequences are so arranged that the first nucleic acid sequence 
affects the function of the second nucleic acid sequence. Preferably, the two sequences are part
of a single contiguous nucleic acid molecule and more preferably are adjacent. For example, a
promoter is operably linked to a gene if the promoter regulates or mediates transcription of the
gene in a cell. 

Those skilled in the art will recognize that there are a number of promoters which are 
functional in plant cells, and have been described in the literature. Chloroplast and plastid 
specific promoters, chloroplast or plastid functional promoters, and chloroplast or plastid 
operable promoters are also envisioned. 

One set of plant functional promoters is the constitutive promoters, such as the CaMV35S or
FMV35S promoters, that yield high levels of expression in most plant organs. Enhanced or
duplicated versions of the CaMV35S and FMV35S promoters are useful in the practice of this
invention (Odell, et al. (1985) Nature 313:810-812; Rogers, U.S. Patent Number 5,378,619). In
addition, it may also be preferred to bring about expression of the prenyltransferase or tocopherol
cyclase gene in specific tissues of the plant, such as leaf, stem, root, tuber, seed, fruit, etc., and
the promoter chosen should have the desired tissue and developmental specificity.

Of particular interest is the expression of the nucleic acid sequences of the present
invention from transcription initiation regions which are preferentially expressed in a plant seed
tissue. Examples of such seed preferential transcription initiation sequences include those
sequences derived from sequences encoding plant storage protein genes or from genes involved
in fatty acid biosynthesis in oilseeds. Examples of such promoters include the 5' regulatory
regions from such genes as napin (Kridl et al., Seed Sci. Res. 1:209-219 (1991)), phaseolin, zein,
soybean trypsin inhibitor, ACP, stearoyl-ACP desaturase, soybean α' subunit of β-conglycinin
(soy 7s, (Chen et al., Proc. Natl. Acad. Sci., 83:8560-8564 (1986))) and oleosin.

It may be advantageous to direct the localization of proteins conferring prenyltransferase
or tocopherol cyclase activity to a particular subcellular compartment, for example, to the
mitochondrion, endoplasmic reticulum, vacuoles, chloroplast or other plastidic compartment. For
example, where the genes of interest of the present invention will be targeted to plastids, such as
chloroplasts, for expression, the constructs will also employ sequences to direct the
gene to the plastid. Such sequences are referred to herein as chloroplast transit peptides (CTP) or
plastid transit peptides (PTP). In this manner, where the gene of interest is not directly inserted
into the plastid, the expression construct will additionally contain a gene encoding a transit
peptide to direct the gene of interest to the plastid. The chloroplast transit peptides may be
derived from the gene of interest, or may be derived from a heterologous sequence having a CTP.
Such transit peptides are known in the art. See, for example, Von Heijne et al. (1991) Plant Mol.
Biol. Rep. 9:104-126; Clark et al. (1989) J. Biol. Chem. 264:17544-17550; della-Cioppa et al.
(1987) Plant Physiol. 84:965-968; Romer et al. (1993) Biochem. Biophys. Res. Commun.
196:1414-1421; and, Shah et al. (1986) Science 233:478-481.

Depending upon the intended use, the constructs may contain the nucleic acid sequence
which encodes the entire prenyltransferase or tocopherol cyclase protein, or a portion thereof.
For example, where antisense inhibition of a given prenyltransferase or tocopherol cyclase
protein is desired, the entire prenyltransferase or tocopherol cyclase sequence is not required.
Furthermore, where prenyltransferase or tocopherol cyclase sequences used in constructs are
intended for use as probes, it may be advantageous to prepare constructs containing only a
particular portion of a prenyltransferase or tocopherol cyclase encoding sequence, for example a
sequence which is discovered to encode a highly conserved prenyltransferase or tocopherol
cyclase region.

The skilled artisan will recognize that there are various methods for the inhibition of
expression of endogenous sequences in a host cell. Such methods include, but are not limited to,
antisense suppression (Smith, et al. (1988) Nature 334:724-726), co-suppression (Napoli, et al.
(1989) Plant Cell 2:279-289), ribozymes (PCT Publication WO 97/10328), and combinations of
sense and antisense (Waterhouse, et al. (1998) Proc. Natl. Acad. Sci. USA 95:13959-13964).
Methods for the suppression of endogenous sequences in a host cell typically employ the
transcription or transcription and translation of at least a portion of the sequence to be
suppressed. Such sequences may be homologous to coding as well as non-coding regions of the
endogenous sequence.

Regulatory transcript termination regions may be provided in plant expression constructs 
of this invention as well. Transcript termination regions may be provided by the DNA sequence
encoding the prenyltransferase or tocopherol cyclase or a convenient transcription termination
region derived from a different gene source, for example, the transcript termination region which
is naturally associated with the transcript initiation region. The skilled artisan will recognize that 
any convenient transcript termination region which is capable of terminating transcription in a 
plant cell may be employed in the constructs of the present invention. 
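
The arrangement of elements described above (promoter, optional transit peptide, coding sequence, termination region) can be summarized as an ordered data structure. The following is an illustrative sketch only; the class name, field names, and the element labels in the usage example are hypothetical placeholders and do not describe any particular vector of the invention.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExpressionCassette:
    """Ordered 5'->3' elements of a plant expression construct (illustrative)."""
    promoter: str                          # promoter functional in a plant cell
    coding_sequence: str                   # prenyltransferase or tocopherol cyclase
    terminator: str                        # transcript termination region
    transit_peptide: Optional[str] = None  # CTP/PTP, only when targeting plastids

    def elements(self) -> List[str]:
        """Return element names in the order in which they are operably linked."""
        parts = [self.promoter]
        if self.transit_peptide is not None:
            # The transit peptide is placed upstream of the coding sequence.
            parts.append(self.transit_peptide)
        parts.extend([self.coding_sequence, self.terminator])
        return parts


# Hypothetical labels, used only to show the ordering.
cassette = ExpressionCassette(
    promoter="seed-preferred promoter",
    coding_sequence="prenyltransferase coding sequence",
    terminator="transcript termination region",
    transit_peptide="heterologous CTP",
)
print(" -> ".join(cassette.elements()))
```

Omitting `transit_peptide` models a construct in which the native protein already carries its own targeting sequence or in which no plastid targeting is intended.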

Alternatively, constructs may be prepared to direct the expression of the prenyltransferase
or tocopherol cyclase sequences directly from the host plant cell plastid. Such constructs and
methods are known in the art and are generally described, for example, in Svab, et al. (1990)
Proc. Natl. Acad. Sci. USA 87:8526-8530 and Svab and Maliga (1993) Proc. Natl. Acad. Sci.
USA 90:913-917 and in U.S. Patent Number 5,693,507.

The prenyltransferase or tocopherol cyclase constructs of the present invention can be
used in transformation methods with additional constructs providing for the expression of other
nucleic acid sequences encoding proteins involved in the production of tocopherols, or
tocopherol precursors such as homogentisic acid and/or phytylpyrophosphate. Nucleic acid
sequences encoding proteins involved in the production of homogentisic acid are known in the
art, and include, but are not limited to, 4-hydroxyphenylpyruvate dioxygenase (HPPD, EC
1.13.11.27) described for example, by Garcia, et al. ((1999) Plant Physiol. 119(4):1507-1516),
mono- or bifunctional tyrA (described for example by Xia, et al. (1992) J. Gen. Microbiol.
138:1309-1316, and Hudson, et al. (1984) J. Mol. Biol. 180:1023-1051), Oxygenase,
4-hydroxyphenylpyruvate di- (9CI) (4-Hydroxyphenylpyruvate dioxygenase;
p-Hydroxyphenylpyruvate dioxygenase; p-Hydroxyphenylpyruvate hydroxylase;
p-Hydroxyphenylpyruvate oxidase; p-Hydroxyphenylpyruvic acid hydroxylase;
p-Hydroxyphenylpyruvic hydroxylase; p-Hydroxyphenylpyruvic oxidase), 4-hydroxyphenylacetate,
NAD(P)H:oxygen oxidoreductase (1-hydroxylating); 4-hydroxyphenylacetate 1-monooxygenase,
and the like. In addition, constructs for the expression of nucleic acid sequences encoding
proteins involved in the production of phytylpyrophosphate can also be employed with the
prenyltransferase or tocopherol cyclase constructs of the present invention. Nucleic acid
sequences encoding proteins involved in the production of phytylpyrophosphate are known in the
art, and include, but are not limited to, geranylgeranylpyrophosphate synthase (GGPPS),
geranylgeranylpyrophosphate reductase (GGH), 1-deoxyxylulose-5-phosphate synthase,
1-deoxy-D-xylulose-5-phosphate reductoisomerase, 4-diphosphocytidyl-2-C-methylerythritol
synthase, and isopentenyl pyrophosphate isomerase.

The prenyltransferase or tocopherol cyclase sequences of the present invention find use in
the preparation of transformation constructs having a second expression cassette for the
expression of additional sequences involved in tocopherol biosynthesis. Additional tocopherol
biosynthesis sequences of interest in the present invention include, but are not limited to,
gamma-tocopherol methyltransferase (Shintani, et al. (1998) Science 282(5396):2098-2100),
tocopherol cyclase, and tocopherol methyltransferase.

A plant cell, tissue, organ, or plant into which the recombinant DNA constructs 
containing the expression constructs have been introduced is considered transformed, transfected,
or transgenic. A transgenic or transformed cell or plant also includes progeny of the cell or plant
and progeny produced from a breeding program employing such a transgenic plant as a parent in 
a cross and exhibiting an altered phenotype resulting from the presence of a prenyltransferase or 
tocopherol cyclase nucleic acid sequence. 

Plant expression or transcription constructs having a prenyltransferase or tocopherol
cyclase as the DNA sequence of interest for increased or decreased expression thereof may be
employed with a wide variety of plant life, particularly, plant life involved in the production of
vegetable oils for edible and industrial uses. Particularly preferred plants for use in the methods
of the present invention include, but are not limited to: Acacia, alfalfa, aneth, apple, apricot,
artichoke, arugula, asparagus, avocado, banana, barley, beans, beet, blackberry, blueberry, 
broccoli, brussels sprouts, cabbage, canola, cantaloupe, carrot, cassava, cauliflower, celery, 
cherry, chicory, cilantro, citrus, Clementines, coffee, corn, cotton, cucumber, Douglas fir, 
eggplant, endive, escarole, eucalyptus, fennel, figs, garlic, gourd, grape, grapefruit, honey dew, 
jicama, kiwifruit, lettuce, leeks, lemon, lime, Loblolly pine, mango, melon, mushroom, nectarine, 
nut, oat, oil palm, oil seed rape, okra, onion, orange, an ornamental plant, papaya, parsley, pea, 
peach, peanut, pear, pepper, persimmon, pine, pineapple, plantain, plum, pomegranate, poplar, 
potato, pumpkin, quince, radiata pine, radicchio, radish, raspberry, rice, rye, sorghum, Southern 
pine, soybean, spinach, squash, strawberry, sugarbeet, sugarcane, sunflower, sweet potato, 
sweetgum, tangerine, tea, tobacco, tomato, triticale, turf, turnip, a vine, watermelon, wheat, yams, 
and zucchini. 

Most especially preferred are temperate oilseed crops. Temperate oilseed crops of 
interest include, but are not limited to, rapeseed (Canola and High Erucic Acid varieties), 
sunflower, safflower, cotton, soybean, peanut, coconut and oil palms, and corn. Depending on 
the method for introducing the recombinant constructs into the host cell, other DNA sequences
may be required. Importantly, this invention is applicable to dicotyledonous and monocotyledonous
species alike and will be readily applicable to new and/or improved transformation and
regulation techniques.

Of particular interest is the use of prenyltransferase or tocopherol cyclase constructs in
plants to produce plants or plant parts, including, but not limited to, leaves, stems, roots,
reproductive parts, and seed, with a modified content of tocopherols in plant parts having
transformed plant cells.

For immunological screening, antibodies to the protein can be prepared by injecting 
rabbits or mice with the purified protein or portion thereof, such methods of preparing antibodies 
being well known to those in the art. Either monoclonal or polyclonal antibodies can be 
produced, although typically polyclonal antibodies are more useful for gene isolation. Western 
analysis may be conducted to determine that a related protein is present in a crude extract of the
desired plant species, as determined by cross-reaction with the antibodies to the encoded
proteins. When cross-reactivity is observed, genes encoding the related proteins are isolated by
screening expression libraries representing the desired plant species. Expression libraries can be
constructed in a variety of commercially available vectors, including lambda gt11, as described in
Sambrook, et al. (Molecular Cloning: A Laboratory Manual, Second Edition (1989) Cold Spring
Harbor Laboratory, Cold Spring Harbor, New York).

To confirm the activity and specificity of the proteins encoded by the identified nucleic 
acid sequences as prenyltransferase or tocopherol cyclase enzymes, in vitro assays are performed 
in insect cell cultures using baculovirus expression systems. Such baculovirus expression 
systems are known in the art and are described by Lee, et al. U.S. Patent Number 5,348,886, the 
entirety of which is herein incorporated by reference. 

In addition, other expression constructs may be prepared to assay for protein activity
utilizing different expression systems. Such expression constructs are transformed into a yeast or
prokaryotic host and assayed for prenyltransferase or tocopherol cyclase activity. Such
expression systems are known in the art and are readily available through commercial sources.

In addition to the sequences described in the present invention, DNA coding sequences
useful in the present invention can be derived from algae, fungi, bacteria, mammalian sources,
plants, etc. Homology searches in existing databases using signature sequences corresponding
to conserved nucleotide and amino acid sequences of prenyltransferase or tocopherol cyclase can
be employed to isolate equivalent, related genes from other sources such as plants and
microorganisms. Searches in EST databases can also be employed. Furthermore, the use of
DNA sequences encoding enzymes functionally and enzymatically equivalent to those disclosed
herein, wherein such DNA sequences are degenerate equivalents of the nucleic acid sequences
disclosed herein in accordance with the degeneracy of the genetic code, is also encompassed by
the present invention. Demonstration of the functionality of coding sequences identified by any
of these methods can be carried out by complementation of mutants of appropriate organisms,
such as Synechocystis, Shewanella, yeast, Pseudomonas, Rhodobacteria, etc., that lack specific
biochemical reactions, or that have been mutated. The sequences of the DNA coding regions
can be optimized by gene resynthesis, based on codon usage, for maximum expression in
particular hosts.
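
The notions of degenerate equivalents and codon-based resynthesis can be illustrated with the standard genetic code. The following is a minimal sketch, assuming unambiguous A/C/G/T input in frame; the preferred-codon table in the usage example is a hypothetical placeholder and is not measured codon usage for any host.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' = stop).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string; a trailing partial codon is ignored."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def are_degenerate_equivalents(a, b):
    """True if two different DNA sequences encode the same amino acid sequence."""
    return a.upper() != b.upper() and translate(a) == translate(b)


def recode(dna, preferred):
    """Swap each codon for the preferred synonymous codon of its amino acid.

    `preferred` maps one-letter amino acids to codons; amino acids without an
    entry keep their original codon.
    """
    dna = dna.upper()
    out = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        new = preferred.get(aa, codon)
        if CODON_TABLE[new] != aa:
            raise ValueError(f"{new} does not encode {aa}")
        out.append(new)
    return "".join(out)


# Hypothetical preference table, for illustration only.
preferred = {"A": "GCC", "L": "CTG"}
original = "ATGGCTTTA"
optimized = recode(original, preferred)
print(optimized, are_degenerate_equivalents(original, optimized))
```

The recoded sequence differs at the nucleotide level but translates to the same polypeptide, which is the sense in which the two sequences are degenerate equivalents.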

For the alteration of tocopherol production in a host cell, a second expression construct
can be used in accordance with the present invention. For example, the prenyltransferase or
tocopherol cyclase expression construct can be introduced into a host cell in conjunction with a
second expression construct having a nucleotide sequence for a protein involved in tocopherol
biosynthesis.

The method of transformation in obtaining such transgenic plants is not critical to the 
instant invention, and various methods of plant transformation are currently available. 
Furthermore, as newer methods become available to transform crops, they may also be directly 
applied hereunder. For example, many plant species naturally susceptible to Agrobacterium
infection may be successfully transformed via tripartite or binary vector methods of
Agrobacterium mediated transformation. In many instances, it will be desirable to have the
construct bordered on one or both sides by T-DNA, particularly having the left and right borders,
more particularly the right border. This is particularly useful when the construct uses A.
tumefaciens or A. rhizogenes as a mode for transformation, although the T-DNA borders may
find use with other modes of transformation. In addition, techniques of microinjection, DNA
particle bombardment, and electroporation have been developed which allow for the 
transformation of various monocot and dicot plant species. 

Normally, included with the DNA construct will be a structural gene having the necessary
regulatory regions for expression in a host and providing for selection of transformant cells. The
gene may provide for resistance to a cytotoxic agent, e.g. antibiotic, heavy metal, toxin, etc.,
complementation providing prototrophy to an auxotrophic host, viral immunity or the like.
Depending upon the number of different host species into which the expression construct or
components thereof are introduced, one or more markers may be employed, where different
conditions for selection are used for the different hosts.

Where Agrobacterium is used for plant cell transformation, a vector may be used which
may be introduced into the Agrobacterium host for homologous recombination with T-DNA or
the Ti- or Ri-plasmid present in the Agrobacterium host. The Ti- or Ri-plasmid containing the
T-DNA for recombination may be armed (capable of causing gall formation) or disarmed
(incapable of causing gall formation), the latter being permissible, so long as the vir genes are
present in the transformed Agrobacterium host. The armed plasmid can give a mixture of normal
plant cells and gall.

In some instances where Agrobacterium is used as the vehicle for transforming host plant
cells, the expression or transcription construct bordered by the T-DNA border region(s) will be
inserted into a broad host range vector capable of replication in E. coli and Agrobacterium, there
being broad host range vectors described in the literature. Commonly used is pRK2 or
derivatives thereof. See, for example, Ditta, et al. (Proc. Nat. Acad. Sci., U.S.A. (1980)
77:7347-7351) and EPA 0 120 515, which are incorporated herein by reference. Alternatively,
one may insert the sequences to be expressed in plant cells into a vector containing separate
replication sequences, one of which stabilizes the vector in E. coli, and the other in
Agrobacterium. See, for example, McBride, et al. (Plant Mol. Biol. (1990) 14:269-276), wherein
the pRiHRI (Jouanin, et al., Mol. Gen. Genet. (1985) 201:370-374) origin of replication is
utilized and provides for added stability of the plant expression vectors in host Agrobacterium
cells.

Included with the expression construct and the T-DNA will be one or more markers,
which allow for selection of transformed Agrobacterium and transformed plant cells. A
number of markers have been developed for use with plant cells, such as resistance to
chloramphenicol, kanamycin, the aminoglycoside G418, hygromycin, or the like. The particular
marker employed is not essential to this invention, one or another marker being preferred
depending on the particular host and the manner of construction.

For transformation of plant cells using Agrobacterium, explants may be combined and 
incubated with the transformed Agrobacterium for sufficient time for transformation, the bacteria 
killed, and the plant cells cultured in an appropriate selective medium. Once callus forms, shoot 
formation can be encouraged by employing the appropriate plant hormones in accordance with
known methods and the shoots transferred to rooting medium for regeneration of plants. The
plants may then be grown to seed and the seed used to establish repetitive generations and for 
isolation of vegetable oils. 

There are several possible ways to obtain the plant cells of this invention which contain
multiple expression constructs. Any means for producing a plant comprising a construct having
a DNA sequence encoding the expression construct of the present invention, and at least one
other construct having another DNA sequence encoding an enzyme, is encompassed by the
present invention. For example, the expression construct can be used to transform a plant at the
same time as the second construct either by inclusion of both expression constructs in a single
transformation vector or by using separate vectors, each of which expresses desired genes. The
second construct can be introduced into a plant which has already been transformed with the
prenyltransferase or tocopherol cyclase expression construct, or alternatively, transformed plants,
one expressing the prenyltransferase or tocopherol cyclase construct and one expressing the
second construct, can be crossed to bring the constructs together in the same plant.

Transgenic plants of the present invention may be produced from tissue culture, and
subsequent generations grown from seed. Alternatively, transgenic plants may be grown using
apomixis. Apomixis is a genetically controlled method of reproduction in plants where the
embryo is formed without union of an egg and a sperm. There are three basic types of apomictic
reproduction: 1) apospory where the embryo develops from a chromosomally unreduced egg in
an embryo sac derived from the nucellus, 2) diplospory where the embryo develops from an
unreduced egg in an embryo sac derived from the megaspore mother cell, and 3) adventitious
embryony where the embryo develops directly from a somatic cell. In most forms of apomixis,
pseudogamy or fertilization of the polar nuclei to produce endosperm is necessary for seed
viability. In apospory, a nurse cultivar can be used as a pollen source for endosperm formation in
seeds. The nurse cultivar does not affect the genetics of the aposporous apomictic cultivar since
the unreduced egg of the cultivar develops parthenogenetically, but makes possible endosperm
production. Apomixis is economically important, especially in transgenic plants, because it
causes any genotype, no matter how heterozygous, to breed true. Thus, with apomictic
reproduction, heterozygous transgenic plants can maintain their genetic fidelity throughout
repeated life cycles. Methods for the production of apomictic plants are known in the art. See
U.S. Patent No. 5,811,636, which is herein incorporated by reference in its entirety.

The nucleic acid sequences of the present invention can be used in constructs to provide 
for the expression of the sequence in a variety of host cells, both prokaryotic and eukaryotic. Host
cells of the present invention preferably include monocotyledonous and dicotyledonous plant
cells.

In general, the skilled artisan is familiar with the standard resource materials which
describe specific conditions and procedures for the construction, manipulation and isolation of
macromolecules (e.g., DNA molecules, plasmids, etc.), generation of recombinant organisms and
the screening and isolating of clones (see for example, Sambrook et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Press (1989); Maliga et al., Methods in Plant
Molecular Biology, Cold Spring Harbor Press (1995), the entirety of which is herein incorporated
by reference; Birren et al., Genome Analysis: Analyzing DNA, 1, Cold Spring Harbor, New York,
the entirety of which is herein incorporated by reference).

Methods for the expression of sequences in insect host cells are known in the art.
Baculovirus expression vectors are recombinant insect viruses in which the coding sequence for a
chosen foreign gene has been inserted behind a baculovirus promoter in place of the viral gene,
e.g., polyhedrin (Smith and Summers, U.S. Pat. No. 4,745,051, the entirety of which is
incorporated herein by reference). Baculovirus expression vectors are known in the art, and are
described for example in Doerfler, Curr. Top. Microbiol. Immunol. 131:51-68 (1968); Luckow
and Summers, Bio/Technology 5:47-55 (1988a); Miller, Annual Review of Microbiol. 42:177-199
(1988); Summers, Curr. Comm. Molecular Biology, Cold Spring Harbor Press, Cold Spring
Harbor, N.Y. (1988); Summers and Smith, A Manual of Methods for Baculovirus Vectors and
Insect Cell Culture Procedures, Texas Ag. Exper. Station Bulletin No. 1555 (1988), the
entireties of which are herein incorporated by reference.

Methods for the expression of a nucleic acid sequence of interest in a fungal host cell are
known in the art. The fungal host cell may, for example, be a yeast cell or a filamentous fungal
cell. Methods for the expression of DNA sequences of interest in yeast cells are generally
described in "Guide to yeast genetics and molecular biology", Guthrie and Fink, eds. Methods in
Enzymology, Academic Press, Inc. Vol 194 (1991) and "Gene expression technology", Goeddel
ed, Methods in Enzymology, Academic Press, Inc., Vol 185 (1991).

Mammalian cell lines available as hosts for expression are known in the art and include
many immortalized cell lines available from the American Type Culture Collection (ATCC,
Manassas, VA), such as HeLa cells, Chinese hamster ovary (CHO) cells, baby hamster kidney
(BHK) cells and a number of other cell lines. Suitable promoters for mammalian cells are also
known in the art and include, but are not limited to, viral promoters such as that from Simian
Virus 40 (SV40) (Fiers et al., Nature 273:113 (1978), the entirety of which is herein incorporated
by reference), Rous sarcoma virus (RSV), adenovirus (ADV) and bovine papilloma virus (BPV).
Mammalian cells may also require terminator sequences and poly-A addition sequences.
Enhancer sequences which increase expression may also be included and sequences which
promote amplification of the gene may also be desirable (for example methotrexate resistance
genes).

Vectors suitable for replication in mammalian cells are well known in the art, and may
include viral replicons, or sequences which insure integration of the appropriate sequences
encoding epitopes into the host genome. Plasmid vectors that greatly facilitate the construction of
recombinant viruses have been described (see, for example, Mackett et al., J. Virol. 49:857
(1984); Chakrabarti et al., Mol. Cell Biol. 5:3403 (1985); Moss, In: Gene Transfer Vectors For
Mammalian Cells (Miller and Calos, eds., Cold Spring Harbor Laboratory, N.Y., p. 10, (1987));
all of which are herein incorporated by reference in their entirety).

The invention also includes plants and plant parts, such as seed, oil and meal derived
from seed, and feed and food products processed from plants, which are enriched in tocopherols.
Of particular interest is seed oil obtained from transgenic plants where the tocopherol level has 
been increased as compared to seed oil of a non-transgenic plant. 

The harvested plant material may be subjected to additional processing to further enrich
the tocopherol content. The skilled artisan will recognize that there are many such processes or
methods for refining, bleaching and degumming oil. United States Patent Number 5,932,261,
issued August 3, 1999, discloses one such process, for the production of a natural carotene rich
refined and deodorised oil by subjecting the oil to a pressure of less than 0.060 mbar and to a
temperature of less than 200°C. Oil distilled by this process has reduced free fatty acids,
yielding a refined, deodorised oil where Vitamin E contained in the feed oil is substantially
retained in the processed oil. The teachings of this patent are incorporated herein by reference.

The invention now being generally described, it will be more readily understood by 
reference to the following examples which are included for purposes of illustration only and are 
not intended to limit the present invention.

EXAMPLES

Example 1: Identification of Prenyltransferase or Tocopherol Cyclase Sequences

PSI-BLAST (Altschul, et al. (1997) Nuc. Acid Res. 25:3389-3402) profiles were generated
for both the straight chain and aromatic classes of prenyltransferases. To generate the straight
chain profile, a prenyltransferase from Porphyra purpurea (Genbank accession 1709766) was
used as a query against the NCBI non-redundant protein database. The E. coli enzyme involved
in the formation of ubiquinone, ubiA (Genbank accession 1790473) was used as a starting
sequence to generate the aromatic prenyltransferase profile. These profiles were used to search
public and proprietary DNA and protein databases. In Arabidopsis six putative
prenyltransferases of the straight-chain class were identified, ATPT1 (SEQ ID NO:9), ATPT7
(SEQ ID NO:10), ATPT8 (SEQ ID NO:11), ATPT9 (SEQ ID NO:13), ATPT10 (SEQ ID
NO:14), and ATPT11 (SEQ ID NO:15), and six were identified of the aromatic class, ATPT2
(SEQ ID NO:1), ATPT3 (SEQ ID NO:3), ATPT4 (SEQ ID NO:5), ATPT5 (SEQ ID NO:7),
ATPT6 (SEQ ID NO:8), and ATPT12 (SEQ ID NO:16). Additional prenyltransferase sequences
from other plants related to the aromatic class of prenyltransferases, such as soy (SEQ ID NOs:
19-23, the deduced amino acid sequence of SEQ ID NO:23 is provided in SEQ ID NO:24) and
maize (SEQ ID NOs:25-29, and 31) are also identified. The deduced amino acid sequence of
ZMPT5 (SEQ ID NO:29) is provided in SEQ ID NO:30.

Searches are performed on a Silicon Graphics Unix computer using additional
Bioaccellerator hardware and GenWeb software supplied by Compugen Ltd. This software and
hardware enables the use of the Smith-Waterman algorithm in searching DNA and protein
databases using profiles as queries. The program used to query protein databases is profilesearch.
This is a search where the query is not a single sequence but a profile based on a multiple
alignment of amino acid or nucleic acid sequences. The profile is used to query a sequence data
set, i.e., a sequence database. The profile contains all the pertinent information for scoring each
position in a sequence, in effect replacing the "scoring matrix" used for the standard query
searches. The program used to query nucleotide databases with a protein profile is tprofilesearch.
Tprofilesearch searches nucleic acid databases using an amino acid profile query. As the search is
running, sequences in the database are translated to amino acid sequences in six reading frames.
The output file for tprofilesearch is identical to the output file for profilesearch except for an
additional column that indicates the frame in which the best alignment occurred.
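
The six-frame translation step described above can be sketched as follows. This is an illustration of the general technique only and is not the tprofilesearch implementation; the function names and frame labels (+1 to +3, -1 to -3) are assumptions, and codons containing characters other than A, C, G or T are reported as X.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' = stop).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate from the first base; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return a dict mapping frame labels to translated sequences."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for frame, protein in six_frame_translations("ATGGCTTTATAA").items():
    print(frame, protein)
```

Each translated frame can then be scored against the amino acid profile, and the label of the best-scoring frame corresponds to the additional output column mentioned above.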

The Smith-Waterman algorithm (Smith and Waterman (1981) supra) is used to search
for similarities between one sequence from the query and a group of sequences contained in the
database. E score values as well as other sequence information, such as conserved peptide
sequences, are used to identify related sequences.
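
For orientation, the local alignment recurrence of the Smith-Waterman algorithm can be sketched as below. This is a simplified illustration, assuming a single query sequence, a linear gap penalty, and arbitrary match, mismatch and gap values; the searches described above use profiles rather than these placeholder scores, and E score calculation is not shown.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions. Only the score is computed;
    traceback of the alignment itself is omitted for brevity.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the
            # alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def rank_database(query, database):
    """Sort (name, sequence) pairs by descending local alignment score."""
    scored = [(smith_waterman_score(query, seq), name) for name, seq in database]
    return sorted(scored, reverse=True)


# Hypothetical toy database, for illustration only.
database = [("seq_a", "TTACGTAA"), ("seq_b", "TTTTTTTT"), ("seq_c", "GGACGTCC")]
print(rank_database("GGACGTCC", database))
```

The identical entry ranks first, the entry sharing only the internal ACGT segment ranks second, and the unrelated entry scores lowest.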

To obtain the entire coding region corresponding to the Arabidopsis prenyltransferase
sequences, synthetic oligonucleotide primers are designed to amplify the 5' and 3' ends of
partial cDNA clones containing prenyltransferase sequences. Primers are designed according to
the respective Arabidopsis prenyltransferase sequences and used in Rapid Amplification of
cDNA Ends (RACE) reactions (Frohman et al. (1988) Proc. Natl. Acad. Sci. USA 85:8998-9002)
using the Marathon cDNA amplification kit (Clontech Laboratories Inc., Palo Alto, CA).

Amino acid sequence alignments between ATPT2 (SEQ ID NO:2), ATPT3 (SEQ ID
NO:4), ATPT4 (SEQ ID NO:6), ATPT8 (SEQ ID NO:12), and ATPT12 (SEQ ID NO:17) are
performed using ClustalW (Figure 1), and the percent identity and similarities are provided in
Table 1 below.

Table 1:

                        ATPT2   ATPT3   ATPT4   ATPT8  ATPT12
ATPT2     % Identity                                       15
          % similar                25      25      22      32
          % Gap                    17      20      20       9
ATPT3     % Identity                       12       6      22
          % similar                        29      16      38
          % Gap                            20      24      14
ATPT4     % Identity                                9      14
          % similar                                18      29
          % Gap                                    26      19
ATPT8     % Identity
          % similar                                        19
          % Gap                                            20
ATPT12    % Identity
          % similar
          % Gap
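
The three figures reported for each pair can be derived from a pairwise alignment as sketched below. The sketch assumes that percentages are taken over the aligned columns and that "similar" counts identical residues plus conservative substitutions; the residue groupings are illustrative assumptions and are not ClustalW's own.

```python
# Hypothetical conservative-substitution groups (an assumption for illustration).
SIMILAR_GROUPS = ("AVLIMC", "FWY", "ST", "NQ", "DE", "KRH")
GROUP_OF = {
    residue: index
    for index, group in enumerate(SIMILAR_GROUPS)
    for residue in group
}


def alignment_percentages(aligned_a, aligned_b):
    """Return percent identity, similarity and gap for two gapped, aligned strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if (a, b) != ("-", "-")  # ignore columns that are gaps in both sequences
    ]
    if not columns:
        raise ValueError("alignment has no residue columns")
    identical = similar = gapped = 0
    for a, b in columns:
        if "-" in (a, b):
            gapped += 1
        elif a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            similar += 1
    total = len(columns)
    return {
        "identity": 100 * identical / total,
        "similar": 100 * similar / total,
        "gap": 100 * gapped / total,
    }


stats = alignment_percentages("MKV-LA", "MRVSLG")
```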



Example 2: Preparation of Prenyl Transferase Expression Constructs 

A plasmid containing the napin cassette derived from pCGN3223 (described in USPN
5,639,790, the entirety of which is incorporated herein by reference) was modified to make it
more useful for cloning large DNA fragments containing multiple restriction sites, and to allow
the cloning of multiple napin fusion genes into plant binary transformation vectors. An adapter
comprised of the self-annealed oligonucleotide of sequence

CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT
(SEQ ID NO:40) was ligated into the cloning vector pBC SK+ (Stratagene) after digestion with
the restriction endonuclease BssHII to construct vector pCGN7765. Plasmids pCGN3223 and
pCGN7765 were digested with NotI and ligated together. The resultant vector, pCGN7770,
contains the pCGN7765 backbone with the napin seed-specific expression cassette from
pCGN3223.
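
As a quick check of why the SEQ ID NO:40 oligonucleotide can anneal to itself, the sketch below tests whether the sequence following the first four bases is its own reverse complement and locates some restriction sites within it. The function names are hypothetical, and the recognition sequences listed are the commonly cited ones for these enzymes, supplied here as assumptions rather than taken from the text.

```python
ADAPTER = "CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT"  # SEQ ID NO:40

# Commonly cited recognition sequences (assumed; not stated in the text above).
SITES = {
    "SwaI": "ATTTAAAT",
    "AscI": "GGCGCGCC",
    "Sse8387I": "CCTGCAGG",
    "NotI": "GCGGCCGC",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def is_self_complementary(dna):
    """True when a strand can base-pair with a second copy of itself."""
    return dna == reverse_complement(dna)


def site_positions(dna, site):
    """Return every 0-based start position of `site` in `dna`, overlaps included."""
    positions, start = [], dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


overhang, core = ADAPTER[:4], ADAPTER[4:]
found = {name: site_positions(ADAPTER, site) for name, site in SITES.items()}
```

Under these assumptions the 52-base core pairs with itself, leaving a CGCG single-stranded end on each strand, which is the kind of end expected to be compatible with BssHII-digested vector.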

The cloning cassette pCGN7787 contains essentially the same regulatory elements as
pCGN7770, with the exception that the napin regulatory regions of pCGN7770 have been replaced
with the double CaMV 35S promoter and the tml polyadenylation and transcriptional
termination region.

A binary vector for plant transformation, pCGN5139, was constructed from pCGN1558
(McBride and Summerfelt, (1990) Plant Molecular Biology, 14:269-276). The polylinker of
pCGN1558 was replaced as a HindIII/Asp718 fragment with a polylinker containing unique
restriction endonuclease sites, AscI, PacI, XbaI, SwaI, BamHI, and NotI. The Asp718 and
HindIII restriction endonuclease sites are retained in pCGN5139.

A series of turbo binary vectors are constructed to allow for the rapid cloning of DNA
sequences into binary vectors containing transcriptional initiation regions (promoters) and
transcriptional termination regions.

The plasmid pCGN8618 was constructed by ligating oligonucleotides 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:41) and 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:42) into SalI/XhoI-digested
pCGN7770. A fragment containing the napin promoter, polylinker and napin 3' region
was excised from pCGN8618 by digestion with Asp718I; the fragment was blunt-ended by filling
in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that had been digested
with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs with Klenow fragment.
A plasmid containing the insert oriented so that the napin promoter was closest to the blunted
Asp718I site of pCGN5139 and the napin 3' was closest to the blunted HindIII site was subjected
to sequence analysis to confirm both the insert orientation and the integrity of cloning junctions.
The resulting plasmid was designated pCGN8622.

The plasmid pCGN8619 was constructed by ligating oligonucleotides 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:43) and 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:44) into SalI/XhoI-digested
pCGN7770. A fragment containing the napin promoter, polylinker and napin 3' region
was removed from pCGN8619 by digestion with Asp718I; the fragment was blunt-ended by
filling in the 5' overhangs with Klenow fragment then ligated into pCGN5139 that had been
digested with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs with Klenow
fragment. A plasmid containing the insert oriented so that the napin promoter was closest to the
blunted Asp718I site of pCGN5139 and the napin 3' was closest to the blunted HindIII site was
subjected to sequence analysis to confirm both the insert orientation and the integrity of cloning
junctions. The resulting plasmid was designated pCGN8623.

The plasmid pCGN8620 was constructed by ligating oligonucleotides 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGGAGCT-3' (SEQ ID NO:45) and 5'-
CCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:46) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region was
removed from pCGN8620 by complete digestion with Asp718I and partial digestion with NotI.
The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment then ligated
into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended by filling in
the 5' overhangs with Klenow fragment. A plasmid containing the insert oriented so that the



WO 02/33060 PCT/US01/42673 

d35S promoter was closest to the blunted Asp718I site of pCGN5139 and the tml 3' was closest
to the blunted HindIII site was subjected to sequence analysis to confirm both the insert
orientation and the integrity of cloning junctions. The resulting plasmid was designated
pCGN8624.

The plasmid pCGN8621 was constructed by ligating oligonucleotides 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCCAGCT-3' (SEQ ID NO:47) and 5'-
GGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:48) into SalI/SacI-digested
pCGN7787. A fragment containing the d35S promoter, polylinker and tml 3' region was
removed from pCGN8621 by complete digestion with Asp718I and partial digestion with NotI.
The fragment was blunt-ended by filling in the 5' overhangs with Klenow fragment then ligated
into pCGN5139 that had been digested with Asp718I and HindIII and blunt-ended by filling in
the 5' overhangs with Klenow fragment. A plasmid containing the insert oriented so that the
d35S promoter was closest to the blunted Asp718I site of pCGN5139 and the tml 3' was closest
to the blunted HindIII site was subjected to sequence analysis to confirm both the insert
orientation and the integrity of cloning junctions. The resulting plasmid was designated
pCGN8625.
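
The way each pair of linker oligonucleotides anneals can be checked with a short sketch. It assumes the shorter oligonucleotide pairs entirely within the longer one, which holds for the SalI/SacI linkers (SEQ ID NOs:45 to 48) but not for the SalI/XhoI pairs; the function name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def annealed_overhangs(top, bottom):
    """Return the unpaired (5' end, 3' end) stretches of `top` once `bottom`
    is annealed to it, or None if `bottom` does not pair within `top`."""
    paired = reverse_complement(bottom)
    start = top.find(paired)
    if start < 0:
        return None
    return top[:start], top[start + len(paired):]


SEQ_45 = "TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGGAGCT"
SEQ_46 = "CCTGCAGGAAGCTTGCGGCCGCGGATCC"
SEQ_47 = "TCGACCTGCAGGAAGCTTGCGGCCGCGGATCCAGCT"
SEQ_48 = "GGATCCGCGGCCGCAAGCTTCCTGCAGG"

linker_8620 = annealed_overhangs(SEQ_45, SEQ_46)
linker_8621 = annealed_overhangs(SEQ_47, SEQ_48)
```

Both pairs leave a 5' TCGA stretch and a 3' AGCT stretch unpaired on the longer strand, which is consistent with insertion between SalI and SacI ends; the two linkers differ only in the orientation of the internal polylinker.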

The plasmid construct pCGN8640 is a modification of pCGN8624 described above. A
938bp PstI fragment isolated from transposon Tn7 which encodes bacterial spectinomycin and
streptomycin resistance (Fling et al. (1985), Nucleic Acids Research 13(19):7095-7106), a
determinant for E. coli and Agrobacterium selection, was blunt ended with Pfu polymerase. The
blunt ended fragment was ligated into pCGN8624 that had been digested with SpeI and blunt
ended with Pfu polymerase. The region containing the PstI fragment was sequenced to confirm
both the insert orientation and the integrity of cloning junctions.

The spectinomycin resistance marker was introduced into pCGN8622 and pCGN8623 as
follows. A 7.7 Kbp AvrII-SnaBI fragment from pCGN8640 was ligated to a 10.9 Kbp
AvrII-SnaBI fragment from pCGN8623 or pCGN8622, described above. The resulting plasmids were
pCGN8641 and pCGN8643, respectively.

The plasmid pCGN8644 was constructed by ligating oligonucleotides 5'-
GATCACCTGCAGGAAGCTTGCGGCCGCGGATCCAATGCA-3' (SEQ ID NO:49) and 5'-
TTGGATCCGCGGCCGCAAGCTTCCTGCAGGT-3' (SEQ ID NO:50) into BamHI-PstI
digested pCGN8640.

Synthetic oligonucleotides were designed for use in Polymerase Chain Reactions (PCR) to
amplify the coding sequences of ATPT2, ATPT3, ATPT4, ATPT8, and ATPT12 for the preparation
of expression constructs and are provided in Table 2 below.



Table 2:

Name     Restriction Site  Sequence                                                  SEQ ID NO:
ATPT2    5' NotI           GGATCCGCGGCCGCACAATGGAGTCTCTGCTCTCTAGTTCT                 51
ATPT2    3' SseI           GGATCCTGCAGGTCACTTCAAAAAAGGTAACAGCAAGT                    52
ATPT3    5' NotI           GGATCCGCGGCCGCACAATGGCGTTTTTTGGGCTCTCCCG' 1 G' 1 " I " r  53
ATPT3    3' SseI           GGATCCTGCAGGTTATTGAAAACTTCTTCCAAGTACAACT                  54
ATPT4    5' NotI           GGATCCGCGGCCGCACAATGTGGCGAAGATCTGTTGTT                    55
ATPT4    3' SseI           GGATCCTGCAGGTCATGGAGAGTAGAAGGAAGGAGCT                     56
ATPT8    5' NotI           GGATCCGCGGCCGCACAATGGTACTTGCCGAGGTTCCAAAGCTTGCCTCT        57
ATPT8    3' SseI           GGATCCTGCAGGTCACTTGTTTCTGGTGATGACTCTAT                    58
ATPT12   5' NotI           GGATCCGCGGCCGCACAATGACTTCGATTCTCAACACT                    59
ATPT12   3' SseI           GGATCCTGCAGGTCAGTGTTGCGATGCTAATGCCGT                      60



The coding sequences of ATPT2, ATPT3, ATPT4, ATPT8, and ATPT12 were all amplified
using the respective PCR primers shown in Table 2 above and cloned into the TopoTA vector
(Invitrogen). Constructs containing the respective prenyltransferase sequences were digested with
NotI and Sse8387I and cloned into the turbobinary vectors described above.
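
The layout of the Table 2 primers can be inspected with a small sketch that looks for the cloning site in each primer and then for the start codon (5' primers) or the reverse complement of a stop codon (3' primers). Only four primers are included, each written as the two table fragments joined end to end, which is an assumption about how the table cells wrap; the recognition sequences and function names are likewise supplied for illustration.

```python
NOTI = "GCGGCCGC"      # assumed NotI recognition sequence
SSE8387I = "CCTGCAGG"  # assumed Sse8387I recognition sequence
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

FIVE_PRIME = {
    "ATPT2": "GGATCCGCGGCCGCACAATGGAGTCTCTGCTCTCTAGTTCT",   # SEQ ID NO:51
    "ATPT12": "GGATCCGCGGCCGCACAATGACTTCGATTCTCAACACT",     # SEQ ID NO:59
}
THREE_PRIME = {
    "ATPT2": "GGATCCTGCAGGTCACTTCAAAAAAGGTAACAGCAAGT",      # SEQ ID NO:52
    "ATPT3": "GGATCCTGCAGGTTATTGAAAACTTCTTCCAAGTACAACT",    # SEQ ID NO:54
}


def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]


def start_codon_offset(primer):
    """Bases between the end of the NotI site and the first ATG, or None."""
    site = primer.find(NOTI)
    if site < 0:
        return None
    atg = primer.find("ATG", site + len(NOTI))
    return None if atg < 0 else atg - (site + len(NOTI))


def stop_codon_after_site(primer):
    """Stop codon encoded (on the coding strand) right after the Sse8387I site."""
    site = primer.find(SSE8387I)
    if site < 0:
        return None
    end = site + len(SSE8387I)
    codon = reverse_complement(primer[end:end + 3])
    return codon if codon in STOP_CODONS else None


offsets = {name: start_codon_offset(p) for name, p in FIVE_PRIME.items()}
stops = {name: stop_codon_after_site(p) for name, p in THREE_PRIME.items()}
```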

The sequence encoding ATPT2 prenyltransferase was cloned in the sense orientation into
pCGN8640 to produce the plant transformation construct pCGN10800 (Figure 2). The ATPT2
sequence is under control of the 35S promoter.

The ATPT2 sequence was also cloned in the antisense orientation into the construct
pCGN8641 to create pCGN10801 (Figure 3). This construct provides for the antisense expression of
the ATPT2 sequence from the napin promoter.

The ATPT2 coding sequence was also cloned in the sense orientation into the vector
pCGN8643 to create the plant transformation construct pCGN10822.

The ATPT2 coding sequence was also cloned in the antisense orientation into the vector
pCGN8644 to create the plant transformation construct pCGN10803 (Figure 4).

The ATPT4 coding sequence was cloned into the vector pCGN864 to create the plant
transformation construct pCGN10806 (Figure 5). The ATPT2 coding sequence was cloned into the
TopoTA™ vector from Invitrogen to create the plant transformation construct
pCGN10807 (Figure 6). The ATPT3 coding sequence was cloned into the TopoTA vector to create
the plant transformation construct pCGN10808 (Figure 7). The ATPT3 coding sequence was cloned
in the sense orientation into the vector pCGN8640 to create the plant transformation construct
pCGN10809 (Figure 8). The ATPT3 coding sequence was cloned in the antisense orientation into the
vector pCGN8641 to create the plant transformation construct pCGN10810 (Figure 9). The ATPT3
coding sequence was cloned into the vector pCGN8643 to create the plant transformation construct
pCGN10811 (Figure 10). The ATPT3 coding sequence was cloned into the vector pCGN8644 to
create the plant transformation construct pCGN10812 (Figure 11). The ATPT4 coding sequence was
cloned into the vector pCGN8640 to create the plant transformation construct pCGN10813 (Figure
12). The ATPT4 coding sequence was cloned into the vector pCGN8641 to create the plant
transformation construct pCGN10814 (Figure 13). The ATPT4 coding sequence was cloned into the
vector pCGN8643 to create the plant transformation construct pCGN10815 (Figure 14). The ATPT4
coding sequence was cloned in the antisense orientation into the vector pCGN8644 to create the
plant transformation construct pCGN10816 (Figure 15). The ATPT8 coding sequence was cloned in
the sense orientation into the vector pCGN8643 to create the plant transformation construct
pCGN10819 (Figure 17). The ATPT12 coding sequence was cloned into the vector pCGN8640 to
create the plant transformation construct pCGN10824 (Figure 18). The ATPT12 coding sequence
was cloned into the vector pCGN8643 to create the plant transformation construct pCGN10825
(Figure 19). The ATPT8 coding sequence was cloned into the vector pCGN8640 to create the plant
transformation construct pCGN10826 (Figure 20).

Example 3: Plant Transformation with Prenyl Transferase Constructs

Transgenic Brassica plants are obtained by Agrobacterium-mediated transformation as
described by Radke et al. (Theor. Appl. Genet. (1988) 75:685-694; Plant Cell Reports (1992)
11:499-505). Transgenic Arabidopsis thaliana plants may be obtained by Agrobacterium-mediated
transformation as described by Valvekens et al. (Proc. Nat. Acad. Sci. (1988)
85:5536-5540), or as described by Bent et al. ((1994), Science 265:1856-1860), or Bechtold et al.
((1993), C.R. Acad. Sci., Life Sciences 316:1194-1199). Other plant species may be similarly
transformed using related techniques.

Alternatively, microprojectile bombardment methods, such as described by Klein et al.
(Bio/Technology 10:286-291) may also be used to obtain nuclear transformed plants.

Example 4: Identification of Additional Prenyltransferases 

Additional BLAST searches were performed using the ATPT2 sequence, a sequence in
the class of aromatic prenyltransferases. ESTs, and in some cases, full-length coding regions,
were identified in proprietary DNA libraries.

Soy full-length homologs to ATPT2 were identified by a combination of BLAST (using
ATPT2 protein sequence) and 5' RACE. Two homologs resulted (SEQ ID NO:95 and SEQ ID
NO:96). Translated amino acid sequences are provided by SEQ ID NO:97 and SEQ ID NO:98.

A rice EST ATPT2 homolog is shown in SEQ ID NO:99 (obtained from BLAST using the
wheat ATPT2 homolog).

Other homolog sequences were obtained using ATPT2 and PSI-BLAST, including EST
sequences from wheat (SEQ ID NO:100), leek (SEQ ID NOs:101 and 102), canola (SEQ ID
NO:103), corn (SEQ ID NOs:104, 105 and 106), cotton (SEQ ID NO:107) and tomato (SEQ ID
NO:108).

A PSI-BLAST profile generated using the E. coli ubiA (GenBank accession 1790473)
sequence was used to analyze the Synechocystis genome. This analysis identified 5 open reading
frames (ORFs) in the Synechocystis genome that were potentially prenyltransferases: slr0926
(annotated as ubiA (4-hydroxybenzoate-octaprenyltransferase), SEQ ID NO:32), sll1899
(annotated as ctaB (cytochrome c oxidase folding protein), SEQ ID NO:33), slr0056 (annotated as
g4 (chlorophyll synthase 33 kd subunit), SEQ ID NO:34), slr1518 (annotated as menA
(menaquinone biosynthesis protein), SEQ ID NO:35), and slr1736 (annotated as a hypothetical
protein of unknown function, SEQ ID NO:36).

4A. Synechocystis Knock-outs

To determine the functionality of these ORFs and their involvement, if any, in the
biosynthesis of tocopherols, knockout constructs were made to disrupt the ORFs identified in
Synechocystis.

Synthetic oligos were designed to amplify regions from the 5' (5'-
TAATGTGTACATTGTCGGCCTC (17365') (SEQ ID NO:61) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCCACAATTCCCCGCACCGTC
(1736kanpr1)) (SEQ ID NO:62) and 3' (5'-AGGCTAATAAGCACAAATGGGA (17363')
(SEQ ID NO:63) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGAATTGGTTTAGGTTATCCC
(1736kanpr2)) (SEQ ID NO:64) ends of the slr1736 ORF.
The 1736kanpr1 and 1736kanpr2 oligos contained 20 bp of homology to the slr1736 ORF with
an additional 40 bp of sequence homology to the ends of the kanamycin resistance cassette.
Separate PCR steps were completed with these oligos and the products were gel purified and
combined with the kanamycin resistance gene from puc4K (Pharmacia) that had been digested
with HincII and gel purified away from the vector backbone. The combined fragments were
allowed to assemble without oligos under the following conditions: 94°C for 1 min, 55°C for 1
min, 72°C for 1 min plus 5 seconds per cycle for 40 cycles using pfu polymerase in 100 ul
reaction volume (Zhao, H and Arnold (1997) Nucleic Acids Res. 25(6):1307-1308). One
microliter or five microliters of this assembly reaction was then amplified using 5' and 3' oligos
nested within the ends of the ORF fragment, so that the resulting product contained 100-200 bp
of the 5' end of the Synechocystis gene to be knocked out, the kanamycin resistance cassette, and
100-200 bp of the 3' end of the gene to be knocked out. This PCR product was then cloned into
the vector pGemT easy (Promega) to create the construct pMON21681 and used for
Synechocystis transformation.
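
The primerless assembly step relies on each flanking fragment sharing a stretch of sequence with one end of the resistance cassette. The sketch below joins fragments through such shared ends; all sequences in it are short hypothetical placeholders rather than the slr1736 or puc4K sequences, and the function names are illustrative.

```python
def merge_by_overlap(left, right, min_overlap=20):
    """Join two strands where the end of `left` equals the start of `right`."""
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("no overlap of at least min_overlap bases")


def assemble(fragments, min_overlap=20):
    """Merge fragments left to right through their shared ends."""
    product = fragments[0]
    for fragment in fragments[1:]:
        product = merge_by_overlap(product, fragment, min_overlap)
    return product


# Hypothetical placeholder sequences (not taken from the constructs above).
ORF_5_END = "ATGACCGAATCTTCGCCCCT"
ORF_3_END = "GGTTAGCTCAGTTGGTAGAG"
CASSETTE = "GATTACAGGCTTCAAGTCCGATAGCTTGGACCATGCAATCGGTACCTTAAGGCCTAGATC"

five_prime_product = ORF_5_END + CASSETTE[:40]    # flank carrying a 40-base cassette tail
three_prime_product = CASSETTE[-40:] + ORF_3_END  # cassette tail followed by the other flank

knockout_insert = assemble(
    [five_prime_product, CASSETTE, three_prime_product], min_overlap=40
)
```

In the reaction itself the overlapping ends prime one another during thermal cycling; the string merge above only mirrors the resulting order of flank, cassette, flank.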
Primers were also synthesized for the preparation of Synechocystis knockout constructs
for the other sequences using the same method as described above, with the following primers.
The ubiA 5' sequence was amplified using the primers 5'-GGATCCATGGTTGCCCAAACCCCATC
(SEQ ID NO:65) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGGGTAAGCAACAATGACCGGC (SEQ ID NO:66).
The 3' region was amplified using the synthetic oligonucleotide primers 5'-
GAATTCTCAAAGCCAGCCCAGTAAC (SEQ ID NO:67) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGGTGCGAAAAGGGTTTTCCC (SEQ
ID NO:68). The amplification products were combined with the kanamycin resistance gene from
puc4K (Pharmacia) that had been digested with HincII and gel purified away from the vector
backbone. The annealed fragment was amplified using 5' and 3' oligos nested within the ends of
the ORF fragment (5'-CCAGTGGTTTAGGCTGTGTGGTC (SEQ ID NO:69) and 5'-
CTGAGTTGGATGTATTGGATC (SEQ ID NO:70)), so that the resulting product contained
100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the kanamycin resistance
cassette, and 100-200 bp of the 3' end of the gene to be knocked out. This PCR product was then
cloned into the vector pGemT easy (Promega) to create the construct pMON21682 and used for
Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout constructs
for the other sequences using the same method as described above, with the following primers.
The sll1899 5' sequence was amplified using the primers 5'-
GGATCCATGGTTACTTCGACAAAAATCC (SEQ ID NO:71) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGCTAGGCAACCGCTTAGTAC (SEQ ID NO:72). The
3' region was amplified using the synthetic oligonucleotide primers 5'-
GAATTCTTAACCCAACAGTAAAGTTCCC (SEQ ID NO:73) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGCCGGCATTGTCTTTTACATG (SEQ ID
NO:74). The amplification products were combined with the kanamycin resistance gene from
puc4K (Pharmacia) that had been digested with HincII and gel purified away from the vector
backbone. The annealed fragment was amplified using 5' and 3' oligos nested within the ends of
the ORF fragment (5'-GGAACCCTTGCAGCCGCTTC (SEQ ID NO:75)
and 5'-GTATGCCCAACTGGTGCAGAGG (SEQ ID NO:76)), so that the resulting product
contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the kanamycin
resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked out. This PCR
product was then cloned into the vector pGemT easy (Promega) to create the construct
pMON21679 and used for Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout constructs
for the other sequences using the same method as described above, with the following primers.
The slr0056 5' sequence was amplified using the primers 5'-
GGATCCATGTCTGACACACAAAATACCG (SEQ ID NO:77) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCGCCAATACCAGCCACCAACAG
(SEQ ID NO:78). The 3' region was amplified using the synthetic oligonucleotide
primers 5'-GAATTCTCAAATCCCCGCATGGCCTAG (SEQ ID NO:79) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGCCTACGGCTTGGACGTGTGGG
(SEQ ID NO:80). The amplification products were combined with the kanamycin
resistance gene from puc4K (Pharmacia) that had been digested with HincII and gel purified
away from the vector backbone. The annealed fragment was amplified using 5' and 3' oligos
nested within the ends of the ORF fragment (5'-CACTTGGATTCCCCTGATCTG (SEQ ID
NO:81) and 5'-GCAATACCCGCTTGGAAAACG (SEQ ID NO:82)), so that the resulting
product contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the
kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked out. This
PCR product was then cloned into the vector pGemT easy (Promega) to create the construct
pMON21677 and used for Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout constructs
for the other sequences using the same method as described above, with the following primers.
The slr1518 5' sequence was amplified using the primers 5'-
GGATCCATGACCGAATCTTCGCCCCTAGC (SEQ ID NO:83) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCAATCCTAGGTAGCCGAGGCG (SEQ ID NO:84).
The 3' region was amplified using the synthetic oligonucleotide primers 5'-
GAATTCTTAGCCCAGGCCAGCCCAGCC (SEQ ID NO:85) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGGGAATTGATTTGTTTAATTACC
(SEQ ID NO:86). The
amplification products were combined with the kanamycin resistance gene from puc4K
(Pharmacia) that had been digested with HincII and gel purified away from the vector backbone.
The annealed fragment was amplified using 5' and 3' oligos nested within the ends of the ORF
fragment (5'-GCGATCGCCATTATCGCTTGG (SEQ ID NO:87) and 5'-
GCAGACTGGCAATTATCAGTAACG (SEQ ID NO:88)), so that the resulting product
contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the kanamycin
resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked out. This PCR
product was then cloned into the vector pGemT easy (Promega) to create the construct
pMON21680 and used for Synechocystis transformation.

4B. Transformation of Synechocystis

Cells of Synechocystis 6803 were grown to a density of approximately 2x10^8 cells per ml
and harvested by centrifugation. The cell pellet was re-suspended in fresh BG-11 medium
(ATCC Medium 616) at a density of 1x10^9 cells per ml and used immediately for transformation.
One hundred microliters of these cells were mixed with 5 ul of mini prep DNA and incubated
with light at 30°C for 4 hours. This mixture was then plated onto nylon filters resting on BG-11
agar supplemented with TES pH 8 and allowed to grow for 12-18 hours. The filters were then
transferred to BG-11 agar + TES + 5 ug/ml kanamycin and allowed to grow until colonies
appeared within 7-10 days (Packer and Glazer, 1988). Colonies were then picked into BG-11
liquid media containing 5 ug/ml kanamycin and allowed to grow for 5 days. These cells were
then transferred to BG-11 media containing 10 ug/ml kanamycin and allowed to grow for 5 days
and then transferred to BG-11 + kanamycin at 25 ug/ml and allowed to grow for 5 days. Cells
were then harvested for PCR analysis to determine the presence of a disrupted ORF and also for
HPLC analysis to determine if the disruption had any effect on tocopherol levels.
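
The cell-density arithmetic in this protocol can be worked through as below. The 50 ml culture volume is a hypothetical figure chosen for the example, the calculation assumes no cells are lost during centrifugation, and the function names are illustrative.

```python
def resuspension_volume_ml(culture_volume_ml, culture_density, target_density):
    """Volume of fresh medium that brings the pelleted cells to `target_density`."""
    return culture_volume_ml * culture_density / target_density


def cells_in_aliquot(density_per_ml, aliquot_ul):
    """Number of cells contained in an aliquot given in microliters."""
    return density_per_ml * aliquot_ul / 1000.0


# Hypothetical 50 ml culture at 2x10^8 cells/ml, resuspended to 1x10^9 cells/ml.
volume_ml = resuspension_volume_ml(50, 2e8, 1e9)
cells_per_transformation = cells_in_aliquot(1e9, 100)

# Stepwise kanamycin selection in liquid culture: (ug/ml, days) as given in the text.
KANAMYCIN_STEPS = [(5, 5), (10, 5), (25, 5)]
liquid_selection_days = sum(days for _, days in KANAMYCIN_STEPS)
```

Under these assumptions a fivefold concentration is needed, each 100 ul transformation contains about 1x10^8 cells, and the liquid selection series takes 15 days.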

PCR analysis of the Synechocystis isolates for slr1736 and sll1899 showed complete
segregation of the mutant genome, meaning no copies of the wild type genome could be detected
in these strains. This suggests that function of the native gene is not essential for cell function.
HPLC analysis of these same isolates showed that the sll1899 strain had no detectable reduction
in tocopherol levels. However, the strain carrying the knockout for slr1736 produced no
detectable levels of tocopherol.

The amino acid sequences for the Synechocystis knockouts are compared using ClustalW,
and are provided in Table 3 below. Provided are the percent identities, percent similarity, and the
percent gap. The alignment of the sequences is provided in Figure 21.



Table 3:

                      slr1736 slr0926 sll1899 slr0056 slr1518
slr1736   %identity                14      12      18      11
          %similar                 29      30      34      26
          %gap                      g       7      10       5
slr0926   %identity                        20      19      14
          %similar                         39
          %gap                              7       9       4
sll1899   %identity                                17      13
          %similar                                 29      29
          %gap                                     12       9
slr0056   %identity                                        15
          %similar                                         31
          %gap                                              8
slr1518   %identity
          %similar
          %gap











Amino acid sequence comparisons are performed using various Arabidopsis
prenyltransferase sequences and the Synechocystis sequences. The comparisons are presented in
Table 4 below. Provided are the percent identities, percent similarity, and the percent gap. The
alignment of the sequences is provided in Figure 22.



Table 4:

                        ATPT2 slr1736   ATPT3 slr0926   ATPT4 sll1899  ATPT12 slr0056   ATPT8 slr1518
ATPT2     %identity                29       9       9       8       8      12       9       7       9
          %similar                 46      23      21      20      20      28      23      21      20
          %gap                     27      13      28      23      29      11      24      25      24
slr1736   %identity                         9      13       8      12      13      15       8      10
          %similar                         19      28      19      28      26      33      21      26
          %gap                             34      12      34      15      26      10      12      10
ATPT3     %identity                                23      11      14      13      10       5      11
          %similar                                 36      26      26      26      21      14      22
          %gap                                     29      21      31      16      30      30      30
slr0926   %identity                                        12      20      17      20      11      14
          %similar                                         24      37      28      33      24      29
          %gap                                             33      12      25      10      11       9
ATPT4     %identity                                                18      11       8       6       7
          %similar                                                 33      23      18      16      19
          %gap                                                     28      19      32      32      33
sll1899   %identity                                                        13      17      10      12
          %similar                                                         24      30      23      26
          %gap                                                             27      13      10      11
ATPT12    %identity                                                                52       8      11
          %similar                                                                 66      19      26
          %gap                                                                     18      25      23
slr0056   %identity                                                                         9      13
          %similar                                                                         23      32
          %gap                                                                             10       8
ATPT8     %identity                                                                                 7
          %similar                                                                                 23
          %gap                                                                                      7
slr1518   %identity
          %similar
          %gap

4C. Phytyl Prenyltransferase Enzyme Assays

[3H]Homogentisic acid in 0.1% H3PO4 (specific radioactivity 40 Ci/mmol). Phytyl
pyrophosphate was synthesized as described by Joo, et al. (1973) Can. J. Biochem. 51:1527.
2-methyl-6-phytylquinol and 2,3-dimethyl-5-phytylquinol were synthesized as described by Soll, et
al. (1980) Phytochemistry 19:215. Homogentisic acid, α, β, δ, and γ-tocopherol, and tocol, were
purchased commercially.

The wild-type strain of Synechocystis sp. PCC 6803 was grown in BG11 medium with
bubbling air at 30°C under 50 µE m^-2 s^-1 fluorescent light, and 70% relative humidity. The growth
medium of the slr1736 knock-out (potential PPT) strain of this organism was supplemented with 25
µg ml^-1 kanamycin. Cells were collected from 0.25 to 1 liter culture by centrifugation at 5000 g
for 10 min and stored at -80°C.

Total membranes were isolated according to Zak's procedures with some modifications (Zak,
et al. (1999) Eur. J. Biochem. 261:311). Cells were broken on a French press. Before the French
press treatment, the cells were incubated for 1 hour with lysozyme (0.5%, w/v) at 30°C in a
medium containing 7 mM EDTA, 5 mM NaCl and 10 mM Hepes-NaOH, pH 7.4. The
spheroplasts were collected by centrifugation at 5000 g for 10 min and resuspended at 0.1-0.5 mg
chlorophyll ml^-1 in 20 mM potassium phosphate buffer, pH 7.8. A proper amount of protease
inhibitor cocktail and DNAase I from Boehringer Mannheim were added to the solution. French
press treatments were performed two to three times at 100 MPa. After breakage, the cell
suspension was centrifuged for 10 min at 5000 g to pellet unbroken cells, and this was followed by
centrifugation at 100 000 g for 1 hour to collect total membranes. The final pellet was resuspended
in a buffer containing 50 mM Tris-HCl and 4 mM MgCl2.

Chloroplast pellets were isolated from 250 g of spinach leaves obtained from local markets.
Deveined leaf sections were cut into grinding buffer (2 L/250 g leaves) containing 2 mM EDTA, 1
mM MgCl2, 1 mM MnCl2, 0.33 M sorbitol, 0.1% ascorbic acid, and 50 mM Hepes at pH 7.5. The
leaves were homogenized for 3 sec three times in a 1-L blender, and filtered through 4 layers of
Miracloth. The supernatant was then centrifuged at 5000 g for 6 min. The chloroplast pellets were
resuspended in a small amount of grinding buffer (Douce, et al. Methods in Chloroplast Molecular
Biology, 239 (1982)).

Chloroplasts in pellets can be broken in three ways. Chloroplast pellets were first aliquoted
in 1 mg of chlorophyll per tube, centrifuged at 6000 rpm for 2 min in a microcentrifuge, and
grinding buffer was removed. Two hundred microliters of Triton X-100 buffer (0.1% Triton X-100,
50 mM Tris-HCl pH 7.6 and 4 mM MgCl2) or swelling buffer (10 mM Tris pH 7.6 and 4
mM MgCl2) was added to each tube and incubated for 1/2 hour at 4°C. Then the broken
chloroplast pellets were used for the assay immediately. In addition, broken chloroplasts can also
be obtained by freezing in liquid nitrogen and stored at -80°C for 1/2 hour, then used for the assay.

In some cases chloroplast pellets were further purified with 40%/80% percoll gradient to
obtain intact chloroplasts. The intact chloroplasts were broken with swelling buffer, then either
used for assay or further purified for envelope membranes with 20.5%/31.8% sucrose density
gradient (Soll, et al. (1980) supra). The membrane fractions were centrifuged at 100 000 g for 40
min and resuspended in 50 mM Tris-HCl pH 7.6, 4 mM MgCl2.

Various amounts of [3H]HGA, 40 to 60 µM unlabelled HGA with specific activity in the
range of 0.16 to 4 Ci/mmole were mixed with a proper amount of 1M Tris-NaOH pH 10 to adjust
pH to 7.6. HGA was reduced for 4 min with a trace amount of solid NaBH4. In addition to HGA,
standard incubation mixture (final vol 1 mL) contained 50 mM Tris-HCl, pH 7.6, 3-5 mM MgCl2,
and 100 µM phytyl pyrophosphate. The reaction was initiated by addition of Synechocystis total
membranes, spinach chloroplast pellets, spinach broken chloroplasts, or spinach envelope
membranes. The enzyme reaction was carried out for 2 hours at 23°C or 30°C in the dark or light.
The reaction was stopped by freezing with liquid nitrogen, with samples stored at -80°C, or directly by
extraction.
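
The dilution of labelled HGA with unlabelled HGA to reach a working specific activity follows a simple mass balance, sketched below. The 50 nmol of unlabelled HGA corresponds to 50 µM in the 1 mL assay volume; the function names are hypothetical, and the unlabelled material is assumed to carry no radioactivity.

```python
def labelled_amount_needed(cold_nmol, stock_sa, target_sa):
    """nmol of labelled substrate (at `stock_sa` Ci/mmol) to add to `cold_nmol`
    of unlabelled substrate so the mixture reaches `target_sa` Ci/mmol."""
    if not 0 < target_sa < stock_sa:
        raise ValueError("target must lie between 0 and the stock specific activity")
    return target_sa * cold_nmol / (stock_sa - target_sa)


def mixture_specific_activity(hot_nmol, cold_nmol, stock_sa):
    """Specific activity (Ci/mmol) after mixing labelled and unlabelled substrate."""
    return hot_nmol * stock_sa / (hot_nmol + cold_nmol)


# 50 nmol unlabelled HGA (50 uM in 1 mL), 40 Ci/mmol stock, 4 Ci/mmol target.
hot_nmol = labelled_amount_needed(50.0, 40.0, 4.0)
achieved = mixture_specific_activity(hot_nmol, 50.0, 40.0)
```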

A constant amount of tocol was added to each assay mixture and reaction products were
extracted with a 2 mL mixture of chloroform/methanol (1:2, v/v) to give a monophasic solution.
NaCl solution (2 mL; 0.9%) was added with vigorous shaking. This extraction procedure was
repeated three times. The organic layer containing the prenylquinones was filtered through a 20
mµ filter, evaporated under and then resuspended in 100 µL of ethanol.

The samples were mainly analyzed by a Normal-Phase HPLC method (isocratic 90% hexane
and 10% methyl-t-butyl ether), using a Zorbax silica column, 4.6 x 250 mm. The samples were
also analyzed by a Reversed-Phase HPLC method (isocratic 0.1% H3PO4 in MeOH), using a
Vydac 201HS54 C18 column, 4.6 x 250 mm, coupled with an Alltech C18 guard column. The
amounts of products were calculated based on the substrate specific radioactivity, and adjusted
according to the % recovery based on the amount of internal standard.
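
The quantification described above can be written out as a short calculation. The conversion factor of 2.22 x 10^12 disintegrations per minute per curie is standard; the counts, specific activity and internal-standard amounts in the example are hypothetical, and the sketch assumes counts have already been corrected to dpm.

```python
DPM_PER_CURIE = 2.22e12  # disintegrations per minute in one curie


def product_pmol(product_dpm, specific_activity_ci_per_mmol,
                 standard_recovered, standard_added):
    """Amount of labelled product in pmol, corrected for extraction recovery.

    `standard_recovered` and `standard_added` are amounts of the internal
    standard (tocol) in any common unit.
    """
    recovery = standard_recovered / standard_added
    curies = product_dpm / DPM_PER_CURIE
    mmol = curies / specific_activity_ci_per_mmol
    return mmol * 1e9 / recovery  # 1 mmol = 1e9 pmol


# Hypothetical run: 2.22e6 dpm in the product peak, substrate at 1 Ci/mmol,
# 8 of 10 units of internal standard recovered.
amount = product_pmol(2.22e6, 1.0, 8.0, 10.0)
```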
The amount of chlorophyll was determined as described in Arnon (1949) Plant Physiol. 24:1.
Amount of protein was determined by the Bradford method using gamma globulin as a standard
(Bradford, (1976) Anal. Biochem. 72:248).

Results of the assay demonstrate that 2-Methyl-6-Phytylplastoquinone is not produced in
the Synechocystis slr1736 knockout preparations. The results of the phytyl prenyltransferase
enzyme activity assay for the slr1736 knock out are presented in Figure 23.

4D. Complementation of the slrl 736 knockout with ATPT2 

In order to determine whether ATPT2 could complement the knockout of slr1736 in
Synechocystis 6803, a plasmid was constructed to express the ATPT2 sequence from the TAC
promoter. A vector, plasmid psl1211, was obtained from the lab of Dr. Himadri Pakrasi of
Washington University, and is based on the plasmid RSF1010, which is a broad host range
plasmid (Ng W.-O., Zentella R., Wang, Y., Taylor J-S. A., Pakrasi, H.B. 2000. phrA, the major
photoreactivating factor in the cyanobacterium Synechocystis sp. strain PCC 6803 codes for a
cyclobutane pyrimidine dimer specific DNA photolyase. Arch. Microbiol. (in press)). The
ATPT2 gene was isolated from the vector pCGN10817 by PCR using the following primers:
ATPT2nco.pr 5'-CCATGGATTCGAGTAAAGTTGTCGC (SEQ ID NO:89); ATPT2ri.pr
5'-GAATTCACTTCAAAAAAGGTAACAG (SEQ ID NO:90). These primers will remove
approximately 112 bp from the 5' end of the ATPT2 sequence, which is thought to be the
chloroplast transit peptide. These primers will also add an NcoI site at the 5' end and an EcoRI
site at the 3' end which can be used for sub-cloning into subsequent vectors. The PCR product
from using these primers and pCGN10817 was ligated into pGEM T easy and the resulting
vector pMON21689 was confirmed by sequencing using the M13 forward and M13 reverse
primers. The NcoI/EcoRI fragment from pMON21689 was then ligated with the EagI/EcoRI
and EagI/NcoI fragments from psl1211 resulting in pMON21690. The plasmid pMON21690
was introduced into the slr1736 Synechocystis 6803 KO strain via conjugation. Cells of sl906 (a
helper strain) and DH10B cells containing pMON21690 were grown to log phase (O.D. 600 =
0.4) and 1 ml was harvested by centrifugation. The cell pellets were washed twice with a sterile
BG-11 solution and resuspended in 200 ul of BG-11. The following was mixed in a sterile
eppendorf tube: 50 ul sl906, 50 ul DH10B cells containing pMON21690, and 100 ul of a fresh
culture of the slr1736 Synechocystis 6803 KO strain (O.D. 730 = 0.2-0.4). The cell mixture was
immediately transferred to a nitrocellulose filter resting on BG-11 and incubated for 24 hours at
30°C and 2500 lux (50 uE) of light. The filter was then transferred to BG-11 supplemented with
10 ug/ml gentamycin and incubated as above for ~5 days. When colonies appeared, they were
picked and grown up in liquid BG-11 + gentamycin 10 ug/ml (Elhai, J. and Wolk, P. 1988.
Conjugal transfer of DNA to Cyanobacteria. Methods in Enzymology 167, 747-54). The liquid
cultures were then assayed for tocopherols by harvesting 1 ml of culture by centrifugation,
extracting with ethanol/pyrogallol, and HPLC separation. The slr1736 Synechocystis 6803 KO
strain did not contain any detectable tocopherols, while the slr1736 Synechocystis 6803 KO
strain transformed with pMON21690 contained detectable alpha tocopherol. A Synechocystis
6803 strain transformed with psl1211 (vector control) produced alpha tocopherol as well.
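
As a quick check on the primer design described in this section, the following sketch scans the two primer sequences (SEQ ID NO:89 and SEQ ID NO:90) for the NcoI (CCATGG) and EcoRI (GAATTC) recognition sequences. The helper name `find_sites` and the dictionary layout are illustrative only; the primer sequences are taken from the text with spacing removed.

```python
# Recognition sequences of the two restriction enzymes named in the text.
RECOGNITION_SITES = {"NcoI": "CCATGG", "EcoRI": "GAATTC"}

# Primer sequences as given in the text, written 5' to 3'.
PRIMERS = {
    "ATPT2nco.pr": "CCATGGATTCGAGTAAAGTTGTCGC",
    "ATPT2ri.pr": "GAATTCACTTCAAAAAAGGTAACAG",
}


def find_sites(sequence):
    """Return {enzyme: 0-based position} for each recognition site present."""
    sequence = sequence.upper()
    return {
        enzyme: sequence.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in sequence
    }


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, find_sites(primer))
```

Each primer carries exactly one of the two sites, at its 5' end. The NcoI recognition sequence also contains an ATG, which can serve as a start codon once the putative transit peptide region has been removed.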

4E. Additional Evidence of Prenyltransferase Activity



To test the hypothesis that slr1736 or ATPT2 are sufficient as single genes to obtain
phytyl prenyltransferase activity, both genes were expressed in SF9 cells and in yeast. When
either slr1736 or ATPT2 was expressed in insect cells (Table 5) or in yeast, phytyl
prenyltransferase activity was detectable in membrane preparations, whereas membrane
preparations of the yeast vector control, or membrane preparations of insect cells did not exhibit
phytyl prenyltransferase activity.

Table 5: Phytyl prenyltransferase activity

Enzyme source                     Enzyme activity [pmol/mg x h]
slr1736 expressed in SF9 cells    20
ATPT2 expressed in SF9 cells      6
SF9 cell control                  <0.05
Synechocystis 6803                0.25
Spinach chloroplasts              0.20



Example 5: Transgenic Plant Analysis 
5A. Arabidopsis

Arabidopsis plants transformed with constructs for the sense or antisense expression of
the ATPT proteins were analyzed by High Pressure Liquid Chromatography (HPLC) for altered 
levels of total tocopherols, as well as altered levels of specific tocopherols (alpha, beta, gamma, 
and delta tocopherol). 

Extracts of leaves and seeds were prepared for HPLC as follows. For seed extracts, 10 
mg of seed was added to 1 g of microbeads (Biospec) in a sterile microfuge tube to which 500 ul 
1% pyrogallol (Sigma Chem)/ethanol was added. The mixture was shaken for 3 minutes in a
mini Beadbeater (Biospec) on "fast" speed. The extract was filtered through a 0.2 um filter into 
an autosampler tube. The filtered extracts were then used in HPLC analysis described below. 

Leaf extracts were prepared by mixing 30-50 mg of leaf tissue with 1 g microbeads and 
freezing in liquid nitrogen until extraction. For extraction, 500 ul 1% pyrogallol in ethanol was
added to the leaf/bead mixture and shaken for 1 minute on a Beadbeater (Biospec) on "fast" 
speed. The resulting mixture was centrifuged for 4 minutes at 14,000 rpm and filtered as 
described above prior to HPLC analysis. 

HPLC was performed on a Zorbax silica HPLC column (4.6 mm x 250 mm) with
fluorescence detection (excitation at 290 nm, emission at 336 nm, and bandpass and slits).
Solvent A was hexane and solvent B was methyl-t-butyl ether. The injection volume was 20 ul,
the flow rate was 1.5 ml/min, and the run time was 12 min (40°C) using the following gradient (Table 6):



Table 6:

Time       Solvent A    Solvent B
0 min.     90%          10%
10 min.    90%          10%
11 min.    25%          75%
12 min.    90%          10%

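
The gradient in Table 6 can be expressed as a small lookup. The sketch below assumes that the solvent composition changes linearly between the tabulated time points, which is how gradient tables are commonly interpreted but is not stated in the text; the names `GRADIENT`, `percent_b` and `percent_a` are hypothetical.

```python
# (time in minutes, percent solvent B) taken from Table 6.
GRADIENT = [(0.0, 10.0), (10.0, 10.0), (11.0, 75.0), (12.0, 10.0)]


def percent_b(t_min):
    """Percent solvent B (methyl-t-butyl ether) at time t_min.

    Assumes linear ramps between the tabulated points.
    """
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 12 min run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("gradient table is not contiguous")


def percent_a(t_min):
    """Percent solvent A (hexane); the two solvents sum to 100%."""
    return 100.0 - percent_b(t_min)
```

Under this assumption the run is isocratic (90% A, 10% B) for the first 10 minutes, followed by a one minute ramp to 75% B and a one minute return to the starting conditions.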
Tocopherol standards in 1% pyrogallol/ethanol were also run for comparison (alpha
tocopherol, gamma tocopherol, beta tocopherol, delta tocopherol, and tocopherol (tocol) (all from
Matreya)).

Standard curves for alpha, beta, delta, and gamma tocopherol were calculated using
Chemstation software. The absolute amount of component x is: Absolute amount of x =
Response_x × RF_x × dilution factor, where Response_x is the area of peak x, RF_x is the response
factor for component x (Amount_x/Response_x) and the dilution factor is 500 ul. The ng/mg tissue
is found by: total ng component/mg plant tissue.
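
The calculation above can be written out as follows. This is a sketch under stated assumptions: the standard amount is taken to be a concentration in ng/ul, so that multiplying by the 500 ul extract volume gives total ng in the extract. The function names are hypothetical and the numbers in the usage note are made up for illustration.

```python
def response_factor(standard_amount, standard_response):
    """RF_x = Amount_x / Response_x, derived from a standard of known amount."""
    if standard_response <= 0:
        raise ValueError("standard peak area must be positive")
    return standard_amount / standard_response


def absolute_amount(response, rf, dilution_factor=500.0):
    """Absolute amount of x = Response_x * RF_x * dilution factor."""
    return response * rf * dilution_factor


def ng_per_mg_tissue(total_ng, tissue_mg):
    """Normalize the total amount of a component to the tissue extracted."""
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    return total_ng / tissue_mg
```

For example, if a hypothetical standard at 2 ng/ul gives a peak area of 400, the response factor is 0.005. A sample peak area of 100 then corresponds to about 250 ng in the 500 ul extract, or about 25 ng/mg for a 10 mg seed sample.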

Results of the HPLC analysis of seed extracts of transgenic Arabidopsis lines containing
pMON10822 for the expression of ATPT2 from the napin promoter are provided in Figure 24.

HPLC analysis results of segregating T2 Arabidopsis seed tissue expressing the ATPT2
sequence from the napin promoter (pCGN10822) demonstrate an increased level of tocopherols
in the seed. Total tocopherol levels are increased as much as 50% over the total tocopherol
levels of non-transformed (wild-type) Arabidopsis plants (Figure 25). Homozygous progeny
from the top 3 lines (T3 seed) have up to a two-fold (100%) increase in total tocopherol levels
over control Arabidopsis seed (Figure 26).

Furthermore, levels of particular tocopherols are also increased in transgenic
Arabidopsis plants expressing the ATPT2 nucleic acid sequence from the napin promoter.
Levels of delta tocopherol in these lines are increased greater than 3 fold over the delta
tocopherol levels obtained from the seeds of wild type Arabidopsis lines. Levels of gamma
tocopherol in transgenic Arabidopsis lines expressing the ATPT2 nucleic acid sequence are
increased as much as about 60% over the levels obtained in the seeds of non-transgenic control
lines. Furthermore, levels of alpha tocopherol are increased as much as 3 fold over those
obtained from non-transgenic control lines.
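
The results in this section mix percentage and fold figures. The sketch below makes the conversion explicit, following the convention the text itself uses when it equates a two-fold increase with 100%, i.e. "fold" is read as the ratio of the transgenic level to the control level. The function names are hypothetical.

```python
def percent_increase_to_fold(percent_increase):
    """A 100% increase over control corresponds to a two-fold level."""
    return 1.0 + percent_increase / 100.0


def fold_to_percent_change(fold):
    """A three-fold level corresponds to a 200% increase over control."""
    return (fold - 1.0) * 100.0


def fold_decrease_to_percent_change(fold_decrease):
    """A ten-fold decrease leaves one tenth of the control level."""
    if fold_decrease <= 0:
        raise ValueError("fold decrease must be positive")
    return (1.0 / fold_decrease - 1.0) * 100.0
```

On this reading, the 50% increase reported for T2 seed is a 1.5-fold level, and a ten-fold decrease in an antisense line is a reduction of about 90% relative to control.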

Results of the HPLC analysis of seed extracts of transgenic Arabidopsis lines containing
pCGN10803 for the expression of ATPT2 from the enhanced 35S promoter (antisense
orientation) are provided in Figure 25. Two lines were identified that have reduced total
tocopherols, up to a ten-fold decrease observed in T3 seed compared to control Arabidopsis
(Figure 27).

5B. Canola

Brassica napus, variety SP30021, was transformed with pCGN10822 (napin-ATPT2-napin
3', sense orientation) using Agrobacterium tumefaciens-mediated transformation. Flowers
of the R0 plants were tagged upon pollination and developing seed was collected at 35 and 45
days after pollination (DAP).

Developing seed was assayed for tocopherol levels, as described above for Arabidopsis. 
Line 10822-1 shows a 20% increase of total tocopherols, compared to the wild-type control, at 45 
DAP. Figure 28 shows total tocopherol levels measured in developing canola seed. 

Example 6: Sequences to Tocopherol Cyclase
6A. Preparation of the slr1737 Knockout

The Synechocystis sp. 6803 slr1737 knockout was constructed by the following method.
The GPS™-1 Genome Priming System (New England Biolabs) was used to insert, by a Tn7
Transposase system, a kanamycin resistance cassette into slr1737. A plasmid from a
Synechocystis genomic library clone containing 652 base pairs of the targeted orf (Synechocystis
genome base pairs 1324051 - 1324703; the predicted orf base pairs 1323672 - 1324763, as
annotated by Cyanobase) was used as target DNA. The reaction was performed according to the
manufacturer's protocol. The reaction mixture was then transformed into E. coli DH10B
electrocompetent cells and plated. Colonies from this transformation were then screened for
transposon insertions into the target sequence by amplifying with M13 Forward and Reverse
Universal primers, yielding a product of 652 base pairs plus ~1700 base pairs, the size of the
transposon kanamycin cassette, for a total fragment size of ~2300 base pairs. After this
determination, it was then necessary to determine the approximate location of the insertion
within the targeted orf, as 100 base pairs of orf sequence was estimated as necessary for efficient
homologous recombination in Synechocystis. This was accomplished through amplification
reactions using either of the primers to the ends of the transposon, Primer S (5' end) or N (3'
end), in combination with either an M13 Forward or Reverse primer. That is, four different primer
combinations were used to map each potential knockout construct: Primer S-M13 Forward,
Primer S-M13 Reverse, Primer N-M13 Forward, Primer N-M13 Reverse. The construct
used to transform Synechocystis and knock out slr1737 was determined to consist of
approximately 150 base pairs of slr1737 sequence on the 5' side of the transposon insertion and
approximately 500 base pairs on the 3' side, with the transcription of the orf and kanamycin
cassette in the same direction. The nucleic acid sequence of slr1737 is provided in SEQ ID
NO:38; the deduced amino acid sequence is provided in SEQ ID NO:39.
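
The coordinates and sizes quoted in this section can be cross-checked with simple arithmetic. The sketch below assumes that the orf coordinates are inclusive and that the final codon is a stop codon; both are assumptions, and the variable names are illustrative. The ~1700 base pair cassette size is the approximate value given in the text.

```python
# Genome coordinates quoted in the text (Cyanobase annotation).
ORF_START, ORF_END = 1323672, 1324763
CLONE_START, CLONE_END = 1324051, 1324703
CASSETTE_BP = 1700  # approximate size of the transposon kanamycin cassette

# Assumption: orf coordinates are inclusive, so both end points are counted.
orf_length_bp = ORF_END - ORF_START + 1
codons = orf_length_bp // 3
# Assumption: the last codon is a stop codon and is not translated.
residues = codons - 1

# The 652 bp figure in the text matches the plain coordinate difference.
clone_span_bp = CLONE_END - CLONE_START
expected_pcr_bp = clone_span_bp + CASSETTE_BP

# The cloned fragment should lie entirely inside the predicted orf.
clone_within_orf = ORF_START <= CLONE_START and CLONE_END <= ORF_END

if __name__ == "__main__":
    print(orf_length_bp, codons, residues)
    print(clone_span_bp, expected_pcr_bp, clone_within_orf)
```

Under these assumptions the orf spans 1092 base pairs, or 364 codons, which agrees with the 363 amino acid residues reported for slr1737 in section 6F. The expected screening product of 652 plus ~1700 base pairs is in line with the approximate 2300 base pair fragment described.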

Cells of Synechocystis 6803 were grown to a density of ~2x10^8 cells per ml and
harvested by centrifugation. The cell pellet was re-suspended in fresh BG-11 medium at a
density of 1x10? cells per ml and used immediately for transformation. 100 ul of these cells were
mixed with 5 ul of mini prep DNA and incubated with light at 30°C for 4 hours. This mixture
was then plated onto nylon filters resting on BG-11 agar supplemented with TES pH 8 and
allowed to grow for 12-18 hours. The filters were then transferred to BG-11 agar + TES +
5 ug/ml kanamycin and allowed to grow until colonies appeared within 7-10 days (Packer and
Glazer, 1988). Colonies were then picked into BG-11 liquid media containing 5 ug/ml
kanamycin and allowed to grow for 5 days. These cells were then transferred to BG-11 media
containing 10 ug/ml kanamycin and allowed to grow for 5 days and then transferred to BG-11 +
kanamycin at 25 ug/ml and allowed to grow for 5 days. Cells were then harvested for PCR
analysis to determine the presence of a disrupted ORF and also for HPLC analysis to determine if
the disruption had any effect on tocopherol levels.

PCR analysis of the Synechocystis isolates, using primers to the ends of the slr1737 orf,
showed complete segregation of the mutant genome, meaning no copies of the wild type genome
could be detected in these strains. This suggests that function of the native gene is not essential
for cell function. HPLC analysis of the strain carrying the knockout for slr1737 showed no
detectable levels of tocopherol.

6B. The relation of slr1737 and slr1736

The slr1737 gene occurs in Synechocystis downstream and in the same orientation as
slr1736, the phytyl prenyltransferase. In bacteria this proximity often indicates an operon
structure and therefore an expression pattern that is linked in all genes belonging to this operon.
Occasionally such operons contain several genes that are required to constitute one enzyme. To
confirm that slr1737 is not required for phytyl prenyltransferase activity, phytyl prenyltransferase
activity was measured in extracts from the Synechocystis slr1737 knockout mutant. Figure 29 shows
that extracts from the Synechocystis slr1737 knockout mutant still contain phytyl prenyltransferase
activity. The molecular organization of genes in Synechocystis 6803 is shown in A. Figures B
and C show HPLC traces (normal phase HPLC) of reaction products obtained with membrane
preparations from Synechocystis wild type and the slr1737 knockout mutant, respectively.

The fact that slr1737 is not required for the PPT activity provides additional evidence that
ATPT2 and slr1736 encode phytyl prenyltransferases.

6C. Synechocystis Knockouts

Synechocystis 6803 wild type and Synechocystis slr1737 knockout mutant were grown
photoautotrophically. Cells from a 20 ml culture of the late logarithmic growth phase were
harvested and extracted with ethanol. Extracts were separated by isocratic normal-phase HPLC
using hexane/methyl-t-butyl ether (95/5) and a Zorbax silica column, 4.6 x 250 mm.
Tocopherols and tocopherol intermediates were detected by fluorescence (excitation 290 nm,
emission 336 nm) (Figure 30).

Extracts of Synechocystis 6803 contained a clear signal of alpha-tocopherol.
2,3-Dimethyl-5-phytylplastoquinol was below the limit of detection in extracts from the
Synechocystis wild type (C). In contrast, extracts from the Synechocystis slr1737 knockout
mutant did not contain alpha-tocopherol, but contained 2,3-dimethyl-5-phytylplastoquinol (D),
indicating that the interruption of slr1737 has resulted in a block of the
2,3-dimethyl-5-phytylplastoquinol cyclase reaction.

Chromatograms of standard compounds alpha, beta, gamma, delta-tocopherol and
2,3-dimethyl-5-phytylplastoquinol are shown in A and B. Chromatograms of extracts from
Synechocystis wild type and the Synechocystis slr1737 knockout mutant are shown in C and D,
respectively. Abbreviations: 2,3-DMPQ, 2,3-dimethyl-5-phytylplastoquinol.

6D. Incubation with Lysozyme-treated Synechocystis

Synechocystis 6803 wild type and slr1737 knockout mutant cells from the late logarithmic
growth phase (approximately 1 g wet cells per experiment in a total volume of 3 ml) were treated
with lysozyme and subsequently incubated with S-adenosylmethionine and
phytylpyrophosphate, plus radiolabelled homogentisic acid. After 17 h incubation in the dark at
room temperature the samples were extracted with 6 ml chloroform/methanol (1/2 v/v). Phase
separation was obtained by the addition of 6 ml 0.9% NaCl solution. This procedure was
repeated three times. Under these conditions 2,3-dimethyl-5-phytylplastoquinol is oxidized to
form 2,3-dimethyl-5-phytylplastoquinone.

The extracts were analyzed by normal phase and reverse phase HPLC. Using extracts
from wild type Synechocystis cells, radiolabeled gamma-tocopherol and traces of radiolabeled
2,3-dimethyl-5-phytylplastoquinone were detected. When extracts from the slr1737 knockout
mutant were analyzed, only radiolabeled 2,3-dimethyl-5-phytylplastoquinone was detectable.
The amount of 2,3-dimethyl-5-phytylplastoquinone was significantly increased compared to wild
type extracts. Heat treated samples of the wild type and the slr1737 knockout mutant did not
produce radiolabeled 2,3-dimethyl-5-phytylplastoquinone, nor radiolabeled tocopherols. These
results further support the role of the slr1737 expression product in the cyclization of
2,3-dimethyl-5-phytylplastoquinol.

6E. Arabidopsis Homologue to slr1737

An Arabidopsis homologue to slr1737 was identified from a BLASTALL search using
Synechocystis sp. 6803 gene slr1737 as the query, in both public and proprietary databases. SEQ
ID NO:109 and SEQ ID NO:110 are the DNA and translated amino acid sequences, respectively,
of the Arabidopsis homologue to slr1737. The start is found at the ATG at base 56 in SEQ ID
NO:109.

The sequence obtained for the homologue from the proprietary database differs from the
public database (F4D11.30, BAC AL022537), in having a start site 471 base pairs upstream of
the start identified in the public sequence. A comparison of the public and proprietary sequences
is provided in Figure 31. The correct start within the public database sequence is at
12080, while the public sequence start is given as being at 11609.

Attempts to amplify a slr1737 homologue were unsuccessful using primers designed from
the public database, while amplification of the gene was accomplished with primers obtained
from SEQ ID NO:109.

Analysis of the protein sequence to identify transit peptide sequence predicted two
potential cleavage sites, one between amino acids 48 and 49, and the other between amino acids
98 and 99.

6F. slr1737 Protein Information

The slr1737 orf comprises 363 amino acid residues and has a predicted MW of 41 kDa
(SEQ ID NO:39). Hydropathic analysis indicates the protein is hydrophilic (Figure 32).

The Arabidopsis homologue to slr1737 (SEQ ID xx) comprises 488 amino acid residues,
has a predicted MW of 55 kDa, and has a putative transit peptide sequence comprising the first
98 amino acids. The predicted MW of the mature form of the Arabidopsis homologue is 44 kDa.
The hydropathic plot for the Arabidopsis homologue also reveals that it is hydrophilic (Figure
33). Further BLAST analysis of the Arabidopsis homologue reveals limited sequence identity (25%
sequence identity) with the beta-subunit of respiratory nitrate reductase. The sequence
identity to nitrate reductase suggests that the slr1737 orf is an enzyme that likely involves a general
acid catalysis mechanism.
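
The predicted molecular weights quoted above can be sanity-checked from the residue counts alone. The sketch below uses the common rule of thumb of roughly 110 Da per amino acid residue, which is an assumption and not the method used to obtain the values in the text; an exact value depends on the actual amino acid composition. The names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, not sequence-specific


def approx_mw_kda(n_residues):
    """Rough molecular weight estimate in kDa from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# Residue counts quoted in the text.
SLR1737_RESIDUES = 363
ARABIDOPSIS_FULL_RESIDUES = 488
TRANSIT_PEPTIDE_RESIDUES = 98

# Mature form: full-length protein minus the putative transit peptide.
arabidopsis_mature_residues = ARABIDOPSIS_FULL_RESIDUES - TRANSIT_PEPTIDE_RESIDUES

if __name__ == "__main__":
    for label, n in [
        ("slr1737", SLR1737_RESIDUES),
        ("Arabidopsis homologue, full length", ARABIDOPSIS_FULL_RESIDUES),
        ("Arabidopsis homologue, mature", arabidopsis_mature_residues),
    ]:
        print(f"{label}: {n} residues, about {approx_mw_kda(n):.1f} kDa")
```

The rough estimates come out a few percent below the predicted values of 41, 55 and 44 kDa given in the text, which is within what such an approximation can be expected to deliver. The mature form corresponds to 390 residues.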

Investigation of known enzymes involved in tocopherol metabolism indicated that the
best candidate corresponding to the general acid mechanism is the tocopherol cyclase. There are
many known examples of cyclases, including tocopherol cyclase, chalcone isomerase, lycopene
cyclase, and aristolochene synthase. On further examination of the microscopic catalytic
mechanism of phytylplastoquinol cyclization, chalcone isomerase, as an example, has a catalytic
mechanism most similar to tocopherol cyclase (Figure 34).

Multiple sequence alignment was performed between slr1737, the slr1737 Arabidopsis
homologue and the Arabidopsis chalcone isomerase (Genbank: P41088) (Figure 35). 65% of the
conserved residues among the three enzymes are strictly conserved within the known chalcone
isomerases. The crystal structure of alfalfa chalcone isomerase has been solved (Jez, Joseph M.,
Bowman, Marianne E., Dixon, Richard A., and Noel, Joseph P. (2000) "Structure and
mechanism of the evolutionarily unique plant enzyme chalcone isomerase". Nature Structural
Biology 7: 786-791). It has been demonstrated that tyrosine (Y) 106 of the alfalfa chalcone
isomerase serves as the general acid during the cyclization reaction (Genbank: P28012). The
equivalent residue in slr1737 and the slr1737 Arabidopsis homolog is lysine (K), which is an
excellent catalytic residue as general acid.

The information available from partial purification of tocopherol cyclase from Chlorella
protothecoides (U.S. Patent No. 5,432,069), i.e., described as being glycine rich, water soluble
and with a predicted MW of 48-50 kDa, is consistent with the protein informatics information
obtained for the slr1737 and the Arabidopsis slr1737 homologue.

All publications and patent applications mentioned in this specification are indicative of 
the level of skill of those skilled in the art to which this invention pertains. All publications and 
patent applications are herein incorporated by reference to the same extent as if each individual 
publication or patent application was specifically and individually indicated to be incorporated by 
reference. 

Although the foregoing invention has been described in some detail by way of illustration 
and example for purposes of clarity of understanding, it will be obvious that certain changes and 
modifications may be practiced within the scope of the appended claim. 
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CLAIMS 

What is claimed is: 

1. An isolated nucleic acid sequence encoding a prenyltransferase.

2. An isolated nucleic acid sequence according to Claim 1, wherein said prenyltransferase is
selected from the group consisting of straight chain prenyltransferase and aromatic prenyltransferase.

3. An isolated DNA sequence according to Claim 1, wherein said nucleic acid sequence is
isolated from a eukaryotic cell source.

4. An isolated DNA sequence according to Claim 3, wherein said eukaryotic cell source is
selected from the group consisting of mammalian, nematode, fungal, and plant cells.

5. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from 
Arabidopsis. 

6. The DNA encoding sequence of Claim 5 wherein said prenyltransferase protein is encoded by
a sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID
NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:14,
SEQ ID NO:15, and SEQ ID NO:16.

7. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from 
soybean. 

8. The DNA encoding sequence of Claim 7 wherein said prenyltransferase protein is encoded by
a sequence comprising a nucleotide sequence selected from the group consisting of SEQ ID NO:19, SEQ
ID NO:20, SEQ ID NO:21, SEQ ID NO:22, and SEQ ID NO:23.

9. The DNA encoding sequence of Claim 7 wherein said prenyltransferase protein is encoded by
a sequence selected from the group consisting of SEQ ID NO:95, and SEQ ID NO:96.

10. The DNA encoding sequence of Claim 7 wherein said prenyltransferase protein has an
amino acid sequence selected from the group consisting of SEQ ID NO:97, and SEQ ID NO:98.

11. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from corn.

12. The DNA encoding sequence of Claim 11 wherein said prenyltransferase protein is encoded
by a sequence comprising a nucleotide sequence selected from the group consisting of SEQ ID NO:25,
SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:104,
SEQ ID NO:105, and SEQ ID NO:106.

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13. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from rice. 

14. The DNA encoding sequence of Claim 13 wherein said prenyltransferase protein is encoded
by a sequence comprising SEQ ID NO:99.

15. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from
wheat.

16. The DNA encoding sequence of Claim 15 wherein said prenyltransferase protein is encoded
by a sequence comprising SEQ ID NO:100.

17. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from leek.

18. The DNA encoding sequence of Claim 17 wherein said prenyltransferase protein is encoded
by a sequence comprising a nucleotide sequence selected from the group consisting of SEQ ID NO:101,
and SEQ ID NO:102.

19. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from
canola.

20. The DNA encoding sequence of Claim 19 wherein said prenyltransferase protein is encoded
by a sequence comprising SEQ ID NO:103.

21. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from
cotton.

22. The DNA encoding sequence of Claim 21 wherein said prenyltransferase protein is encoded
by a sequence comprising SEQ ID NO:107.

23. The DNA encoding sequence of Claim 4 wherein said prenyltransferase protein is from
tomato.

24. The DNA encoding sequence of Claim 23 wherein said prenyltransferase protein is encoded
by a sequence comprising SEQ ID NO:108.

25. An isolated DNA sequence according to Claim 4, wherein said prokaryotic source is a
Synechocystis sp.

26. A nucleic acid construct comprising as operably linked components, a transcriptional
initiation region functional in a host cell, a nucleic acid sequence encoding a prenyltransferase, and a
transcriptional termination region.

27. A nucleic acid construct according to Claim 26, wherein said nucleic acid sequence encoding
prenyltransferase is obtained from an organism selected from the group consisting of a eukaryotic
organism and a prokaryotic organism.

28. A nucleic acid construct according to Claim 27, wherein said nucleic acid sequence encoding
prenyltransferase is obtained from a plant source.

29. A nucleic acid construct according to Claim 28, wherein said nucleic acid sequence encoding
prenyltransferase is obtained from a source selected from the group consisting of Arabidopsis, soybean,
corn, rice, wheat, leek, canola, cotton, and tomato.

30. A nucleic acid construct according to Claim 26, wherein said nucleic acid sequence encoding
prenyltransferase is obtained from a Synechocystis sp.

31. A plant cell comprising the construct of Claim 26.

32. A plant comprising a cell of Claim 31.

33. A feed composition produced from a plant according to Claim 32.

34. A seed comprising a cell of Claim 31.

35. Oil obtained from a seed of Claim 34.

36. A natural tocopherol rich refined and deodorised oil which has been produced by a
method of treating an oil according to Claim 35 by distilling under low pressure and high
temperature, wherein said refined oil has reduced free fatty acids and a substantial percentage of
tocopherol present in the pretreated oil.

37. A refined oil according to claim 36, wherein the pretreated oil is crude or pre-treated
soybean oil.

38. A refined oil according to claim 36, wherein the refined oil is degummed and
bleached.

40. A method for the alteration of the isoprenoid content in a host cell, said method comprising;
transforming said host cell with a construct comprising as operably linked components, a transcriptional
initiation region functional in a host cell, a nucleic acid sequence encoding prenyltransferase, and a
transcriptional termination region,

wherein said isoprenoid compound is selected from the group of tocopherols and tocotrienols.

41. The method according to Claim 40, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

42. The method according to Claim 41, wherein said prokaryotic cell is a Synechocystis sp.

43. The method according to Claim 41, wherein said eukaryotic cell is a plant cell.

44. The method according to Claim 43, wherein said plant cell is obtained from a plant selected
from the group consisting of Arabidopsis, soybean, corn, rice, wheat, leek, canola, cotton, and
tomato.

45. A method for producing an isoprenoid compound of interest in a host cell, said method
comprising obtaining a transformed host cell, said host cell having and expressing in its genome:

a construct having a DNA sequence encoding a prenyltransferase operably linked to a
transcriptional initiation region functional in a host cell,

wherein said prenyltransferase is involved in the synthesis of tocopherols,

and wherein said isoprenoid compound is selected from the group of tocopherols and tocotrienols.

46. The method according to Claim 45, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

47. The method according to Claim 46, wherein said prokaryotic cell is a Synechocystis sp.

48. The method according to Claim 46, wherein said eukaryotic cell is a plant cell.

49. The method according to Claim 48, wherein said plant cell is obtained from a plant selected
from the group consisting of Arabidopsis, soybean, corn, rice, wheat, leek, canola, cotton, and
tomato.

50. A method for increasing the biosynthetic flux in a host cell toward production of an
isoprenoid compound, said method comprising;

transforming said host cell with a construct comprising as operably linked components, a
transcriptional initiation region functional in a host cell, a DNA encoding a prenyltransferase,
and a transcriptional termination region,

wherein said isoprenoid compound is selected from the group of tocopherols and
tocotrienols.

51. The method according to Claim 50, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

52. The method according to Claim 51, wherein said prokaryotic cell is a Synechocystis sp.

53. The method according to Claim 51, wherein said eukaryotic cell is a plant cell.



WO 02/33060 PCT/US01/42673 



54. The method according to Claim 50, wherein said plant cell is obtained from a plant selected
from the group consisting of Arabidopsis, soybean, corn, rice, wheat, leek, canola, cotton, and
tomato.

55. The method according to Claim 50, wherein said transcriptional initiation region is a
seed-specific promoter.
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Query Sequence: F4D11 AL022537 

Database: PIR T04448 .atcea . list . fasta 

Database: PIR~T04448 
Plus (+) denotes forward strand, and minus (-) reverse strand, 
Asterisks (*) denote bases not shown on pair wise alignmnts, 

Alignment 1 



Query- 
genomic 

ATCEA4C371+ 



12194 CACACGTTCTCGTCCTTTTCTTCTTCCTCTCTGCATTCTTCACAGAGTTTGTCACCACCA 



I 



C est 



MET 

Query- 
ATCEA4C371+ 



: first 



12134 





liil!!i|-|!lillll!l!i;ill!!i!l!i!ll!lllilllll!li!lilillMI 
2 ACCCCAAACATCACAATTTCACATTCTTTTGCATATTTCTTCTTCTTCTTCCATTATGGA 




Query- 
ATCEA4C371+ 



12075 GATACGGAGCTTGATTGTTTCT ATG AACCCT AATTT ATCTTCCTTTG AGCTCTCTCGCCC 

IIIIIIIMIIIIIIIIItllllllllllllllilllllllllllllllllJIIIIIIII 

62 GATACGGAGCTTGATTGTTTCTATGAACCCTAATTTATCTTCCTTTGAGCTCTCTCGCCC 



Query- 
ATCEA4C371+ 



12015 TGTATCTCCTCTCACTCGCTCACTAGTTCCGTTCCGATCGACTAAACTAGTTCCCCGCTC 

MllllllllllIllllllllllllllllllllllllllMIIINllllllllllllll 

122 TGTATCTCCTCTCACTCGCTCACTAGTTCCGTTCCGATCGACTAAACTAGTTCCCCGCTC 



Query- 
ATCEA4C371+ 



11955 CATTTCTAGGGTTTCGffiOffiATCTCCACCCCGAATAGTGAAACTGACAAGATCTCCGT 

lllllllllliillllHMIIIIIIlllllllllIlllllllllMlllliniHIII 

182 CATTTCTAGGGTTTCGGCGTCGATCTCCACCCCGAATAGTGAAACTGACAAGATCTCCGT 



Query- 

ATCEA4C371+ 

here 



11895 TAAACCTGTTTACGTCCCGACGTCTCCCMTCGCGAACTCCGGACT^^eAgTGJGTA 

lllllllllllllllllllllllllllllllllllllMllllllllllllllllll — • 

2 4 2 T AAACCTGTTTACGTCCCGACGTCTCCC AATCGCGAACTCCGG ACTg^^^S 

Synecho seq aligns from 



Query- 
ATCEA4C371+ 



11335 AATTGATCCATTCCATTCCATTTCTCTTCTCTTGTTTGTTTTATTAAGCTCCAATTTCAG 



299 



FIG. 31 
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~- 60 bp removed 

.»••«» 
Query- 11715 ****** + v ** H *** t ***************** Jtu ******* tt **** + ****** TTTG 

ATCEA4C371+ 299 
PIR:T04448 1 

»••»•»«»•»** 
Query- 1 1 655 GTGGCTCACCATTCGACGACTACTTTTGAATTTGAGTTTTTGAAAAATGCAATTTAACAT 

ATCEA4C371+ 299 

PIR:T04448 1 M Q F N I 

arab sequence which is incorrect 

• * * • • * 
**•**•••*••• 

Query- ■ 11595 CAGAGAGTTTTTTTTTTTATGGTTGATAACTTATTGTTTAACTTTTGAAAAATGCAfflEI 

----- Ill 

ATCEA4C371+ 299 HI 

PIR:T04448 6REFFFLWLITYCLTFEKC-RY 

• . » • • • 

Query- 11535 CCATTTCGATGGAACACCTCGGAAGTTCTTCGAGGGATGGTATTTCAGGGTTTCCATCCC 

lllMlllllllllllllllllllllllllliilllMrillllll llllll llllllll 
ATCEA4C371+ 302 CCATTTCGATGGAACACCTCGGAAGTTCTTCGAGGGATGGTATTTCffl^UTCCATCCC 

PIR:T04448 26HFDGTPRK'FFEGWYF SB S I P 

■ * » • • * 

*»*** ******* 

Query- 11475 AGAGAAGAGGGAGAGTTTTTGTTTTATGTATTCTGTGGAGAATCCTGCATTTCGGCAGAG 

!lill!itlfl!!l!!llllll!l!lll IIIIIIIIIIMIIIIIIIMJMIMIMII 
ATCEA4C37 1 + 362 AGAGAAGAGGGAGAGTTTTTGTTTTATGTATTCTGTGGAGAATCCTGCATTTCGGCAGAG 

PIR:T04448 46 EKRESFCFMYSVENPAFRQS 

• • * * • * 

Query- 11415 TTTGTCACCATTGGAAGTGGCTCTATATGGACCTAGATTCACTGGTGTTGGAGCTCAGAT 

|||(l!ili!!!lllllilltlllllll(UlllllHM)llilllil!lll!ll!lll 

ATCEA4C37 1+ 422 TTTGTCACCATTGGAAGTGGCTCTATATGGACCTAGATTCACTGGTGTTGGAGCTCAGAT 

,,,,,, ,,,,>•.!••*••*••*•*•*• ••«••«•*•••• ******************** 

PiR:T04448 68 LSPLEVALYGPRFTGVGAQI 
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Query- 
ATCEA4C371+ 






11355 TCTTGGCGCTAATGATAAATATTTATGCC A ATACGAACAAGACTCTCACAATTTCTGGGG 

iiiiimimiiiimiiimiiimmiiiiiiiimiiiiiiiiii----- 

4 8 2 TCTTGGCGCTAATGATAAATATTTATGCCAATACGAAC AAGACTCTCAC AATTTC 



PIR:T04448 86LGANDKYLCQYEQDSHNFWG 
ATCEA4C371f Exon 11538 11301 Confidence: 100 100 



Query- 
ATCEA4C371+ 

PIR;T04448 
PIR:T04448 



11295 AGGTA ACTCCTTGACCCTT AAAATGCTGTGTCATGAC AATAAGAAATCATATCTGAGTCT 



537 



106 D 

Exon 



11609 11294 Confidence: 100 100 



Query- 
PIR:T04448 



11235 TTTCTCTACTTCTAGTACTAATGTTCGTTATTGTTGTTAAAGATCTAAGTCTTATCTGAA 



107 



Query- 
PIR:T04448 



11175 TTTTGTTACATTTTGGTTCTGGTGCTTTCTCAACATGAATTTGTATATATGACTTTAAAG 



107 



Query- 
PIR:T04448 



11115 ATTGCTTACCTAAAGTTTTTACTC ATGCATAGATCGACATGAGCTAGTTTTGGGGAATAC 
107 RH'ELVLGNT 



Query- 

PrR:T04448 
PIR:T04448 



11055 TTTT AGTGCTGTGCC AGGCGCAAAGGCTCCAAACAAGGAGGTTCC ACCAGAGGTTCTCAC 

116 F S A V P G A K A P N K t V P P E 
Exon 11083 11004 Confidence: 96 100 



Query- 
PIR:T04448 



10995 TCCTCCCTTGTTGGTTACTTTGTTATCTGTTAAATAGTTTTCCAATTGTATCCGGATAGT 



133 



Query- 



PIR:T04448 



10935 GTTCT ACTTCTCCTTGTAGAAAATCTCAAGTTTTTGTTACTCTTGCTATTCTCTTGG ATG 



133 
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Query- 
PIR:T04448 



10875 TTGATTTGTAAAGCATGTCGTTTTATTGTAGGAATTTAACAGAAGAGTGTCCGAAGGGTT 
133 EFNRRVSEGF 



Query- 

PIR:T04448 
PIR:T04448 



10815 CCAAGCT ACTCCATTTTGGCATCAAGGTC ACATTTGCG ATGATGGCCGGTAATTATATGA 

143QATPFWKQGHICDDGR 
Exon 10844 10768 Confidence: 100 100 



Query- 
PIR:T04448 



10755 TTCTATGCACAACAAGAATTCACTATATTATAAATATTGGATATTGAGTATTTTTGTTGA 



159 



Query- 
PIR:T04448 



1 0695 AAATTTCTGTGTTTAAATCTGACTTGACTTGTTTTGTCAGTACTGACTATGCGGAAACTG 



159 



• *t*****»*tt«*t«*+«t 

T D Y A E T V 



Query- 
PIR:T04448 



10635 TG AAATCTGCTCGTTGGGAGTAT AGTACTCGTCCCGTTT ACGGTTGGGGTGATGTTGGGG 
166 KSARWEYSTRPVYGWGDVGA 



Query- 
PIR:T04448 



10575 CCAAACAGAAGTC AACTGC AGGCTGGCCTGCAGCTTTTCCTGTATTTGAGCCTCATTGGC 
186 KQKSTAGWPAAFPVFEPHWQ 



Query- 

PIR:T04448 
PIR:T04448 



10515 AGATATGCATGGCAGGAGGCCTTTCCACAGGTGTGAGCTTTGCTTGATTGACTTAAAGTT 



206 rCMAGGLSTG 

Exon 10655 10436 Confidence: 96 100 



Query- 
PIR:T04443 



10455 AATAAATAGACGGTTAAGTTTACTTGCCTAGTACTAACAGAAAATTAAGAAAGAAACCAC 



216 
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Query- 
PIR:T04448 

Query- . 
PIR:T04448 






10395 
216 



: . : 

CCTCTTTCTATCAGCAGAAACTGCTATTGTAGTTCTTATTTTTTCTCTTGTATTTGCA6G 



, . ; . ; 

10335 GTGGATAGMTGGGGCGGTGAAAGGTTTGAGTTTCGGGATGCACCTTCTTATTCAGAGAA 



216HIEWGGERFEFRDAPSYSEK 



Query- 10275 

PIR:T04448 
PIR:T04448 



GAATTGGGGTGGAGGCTTCCCAAGAAAATGGTTTTGGGTAAAACATTTCATCCTTTTGCT 



236 NWGGGFPRKWFW 

Exon 10336 10239 Confidence: 96 100 



Query- 10215 

PIR:T04448 248 

Query- 10155 

PIR:T04448 248 



ACATTTCTTGTTGCAGACTTTAGTTAGCTAGTGGACCTGTGTATACAGCCACATATAGTA 



TACTTGTTTGATAGCTTTATTTGTCAATGTCTCTTTACAGGTCCAGTGTAATGTCTTTGA 



******* 



V Q C N V F E 



Query- 
PIR:T04448 

Query- 

PIR:T04448 
PIR:T04448 

Query- 
PIR:T04448 284 

Query- 

P1R-.T04448 235 
GSDB:S:495- 532 



: • : • : • : • : ' ; 
10095 AGGGGCMCTGGAGAAGTTGCTTTAACCGCAGGTGGCGGGTTGAGGCAATTGCCTGGATT 

: '• : : '• : ;:: : ;'•'••*• : ; :: : ; : 
255 GATGEVALTAGGGLRQLPGL 

10035 GACTGAGACCTATGAAAATGCTGCACTGGTATGCACTTATAAGATCTTCTTAAGCAATGA 



. . • ■ • t * * 



( » » i » » » » » • 



275 TETYENAAL 

Exon 10115 10008 Confidence: 100 100 

. : • : • : • : • : • ; 

9975 CAGTGAGTATTAGAAGGCAGATAGTTTACAAAAGCTCTGGGCCCTTGTAAATCTGCAGGT 



V 



9915 TTGTGTACACTATGATGGAAAAATGTACGAGTTTGTTCCTTGGAATGGTGTTGTTAGATG 



♦ # * * * • * 



,* 



******* * 



,((••>**•*• 



HYOGKMYEFVPWNGVVRW 

- - tlllll 

tagatg 
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Query- 

PIR:T04448 

GSDB:S:495- 

PIR:T04448 

GSDB:S:495- 

Query- 

PIR:T04448- 

GSDB:S:495- 

Query- 

PIR:T04448 

GSDB:S:495- 

Query- 

PIR:T04448 

GSDB;S:495- 

Query- 

PIR:T04448 

GSDB:S:495- 

Query- 

PIR:T04448 

GSDB;S:495- 

PIR:T04448 

GSDB:S:495- 



9855 GGAAATGTCTCCCTGGGG TTATTGGTATATAACTGCAGAGAACGAAAACCATGTGGTAA 

::::::::::::::::::::::::::::::::::::;-— 
305 EMSPWG YWYITA'ENENHV 

Hllll-lllttllllll-llltitillllllllilltllll tl IIIIIIIIH-— 
526 ggaaat tctccctgggggttattggtatataactgcagagaNcgNaaaccatgtg 
Exon 9917 9801 Confidence; 100 100 
Exon 9961 9801 Confidence: 93 93 

9796 ATTTGTTTTACTAGTTTCATTCAGTTTTACTTTTGACATCATATCATTCCCTTATGGCTA 
323 

471 

9736 GATTCCAACACCCGATGMTGTCTTGTGACAGGTGGAACTAGAGGCAAGMCAAATGAAG 
323 VELEARTNEA 

- : iiiiiliiiinii niiiniMii! 

471 ■ gtggaactagaggcNagaacaaatgaag 

: . : 

9676 CGGGTACACCTCTGCGTGCTCCTACCACAGAAGTTGGGCTAGCTACGGCTTGCAGAGATA ■ 

I::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 
333 GTPLRAPTTEVGLATACRDS 

iltllllllMMIIIIIMIIIIIIIIIIIirilllllMIIIIIIIIIIIIIMIIII 
443 cgggtacacctctgcgtgctcctaccacagaagttgggctagctacggcttgcagagata 

* . i * • * * 

9616 GTTGTTACGGTGAATTGAAGTTGCAGATATGGGAACGGCTATATGATGGAAGTAAAGGCA 
J:::::::::::::::::::;::::::::::::::::::::::::::::::::;:::::: 

353 CYGELKLQIWERLYDGSKGK 

IMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIillllllll 
383 gttgttacggtgaattgaagttgcagatatgggaacggctatatgatggaagtaaaggca 

* * • • • • 

9556 AGGTATGTATGCTAATGTGATCCAATCCCTGTAGTTAAAAGTCTTAACAAATCCTAAGGC 

373 LKVLTNPKA 
II 

323 ag 

Exon 9704 9555 Confidence: 100 100 
Exon 9704 9555 Confidence: 98 100 

FIG. 31 (CONT-5) 
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Query- 

PIR:T04448 

GSDB:S:495- 

Query- 

PIR:T04443 

GSDB:S:495- 

Quecy- 

PIR:T04448 

GSDB:S:495- 

Query- 
(stop) 

PIR:T04448 

GSDB:S:495- 
PIR:T04448 

Query- 

PIR:T04448 

GSOB:S:495 

Query- 

GSD8:.S:495 
u 

Query- 

GSDB:S:495 
GSDB:S:495 




tt**t* ****** 

9496 AGTGAAAGAAG ATTATGAACGTTTGTTATGGTT AACAATG ATGCAGGTGAT ATTAGAG AC 
382 VKEDYERLLWLT'MMQVILET 
321 gtgatattagagac 

• * • • » * 

9436 AAAGAGCTCAATGGCAGCAGTGGAGATAGGAGGAGGACCGTGGTTTGGGACATGGAAAGG 
402 KSSMAAVE I GGGPW FGTWKG 
307 aaagagctcaatqqcaMcaqtggagataggaggaggaccgtggtttgggacatggaaagg 

9376 AGATACGAGCAACACGCCCGAGCTACTAAAACAGGCTCTTCAGGTCCCATTGGATCTTGA 
422 DTSNTPELLKQALQVPLDLE 
247 agatacgagcaacacgcccgagctactaaaacaggctcttcaggtcccattggatcttga 



9316 AAGCGCCTTAGGTTTGGTCCCTTTCTTCAAGCCACCGGGTCTGM^^^^^ 



442 SALGLVPFFKPPGL 



187 aagcgccttaggtttggtccctttcttcaagccaccgggtctgtaacattgatgagtgtt 
Exon 9522 ' 9274 Confidence: 100 100 



9256 iKSffif^ 
456 



127 tkgtttgttgatagagacccatgtgatgaatgaagccttagtcatgtcattgctagcttc 

» » * » • • 

tit* 

9196 ACTATTATGTATGTATGATTTTAGTTCGTTCGGTCCTTGTGGTAAATGATACGGGCCAGT 

67 actattatgtatgtatgattttagttcgttcggtccttgtggtaaatgatacgggccagt 

• . • * # * 

• * • * * * * * ■ ^ * * ■ 

9136 GTAAAGTCTAGTTCAATAAAAGCCTTG AGTCGCATAATTTC AATTTCAAATTGC ATC 
7 gtaaagt 

Exon 9450 9130 Confidence: 98 100 
ATCEA4C37145 1 3063693|emb|CAA18584.1| 4.0e-43 (AL022537) putative protein 
[Arabidopsis thaliana] 

PIR:T04448 PIR:T04448 hypothetical protein F4D11.30 - Arabidopsis thaliana; 
gi|3063693|emb|CAA18584.1| (AL022537) putative protein [Arabidopsis thaliana] 

GSDB:S:4955486|AI995392|AI995392|701673779 A. thaliana, Columbia Col-0, 
inflorescence-1 Arabidopsis thaliana cDNA clone 701673779, mRNA sequence. 
Aligned sequences (row labels repeat for each of the ten alignment blocks): 
slr1737_SYNSP_S74814 
slr1737_ARATH_T04448 
CFI_ARATH_P41088 



: M 

MEIRSLIVSMNPNLSSFELSRPVSPLTRSLVPFRSTKLVPRSISRVSASI 

-T " 

KF p- PHSGYHWQGQ&PFFEGWYVRLL 

STPNSETDKISVKPVYVPTSPNRELRTPHSGYHFDGTPRKFFBGWYFRVS 

LPQSGESFAFMYSIENPASDHHYGGGAVQILGPATK KQENQEDQLV 

IPEKRESFCFMYSVENPAFRQSLSPLEVALYGPRFTGVGAQILGANDKYL 

MSSSNACAS PSPFPA VTKLHVDSV- 

WRTFPSVKKFWASPRQFALG-HWGKCRDNRQ-AKPLLSEEFFATVKEGYQ 
CQYEQDSHNFWGDRHELVLGNTFSAVPGAKAPNKEVPPEEFNRRVSEGFQ 
— TFVPSVKSPASSNPLFLG-GAGVRGLDIQ-GK FVIFTVIGVY 

IHQNQHQGQIIHGDR HCRWQFTVEPEVTWGSPNRFPRATAGW 

ATPFWHQGHICDDGRTDYAETVKSARWEYSTRPVYGWGDVGAKQKSTAGW 
LEGNAVPSLSV KWKGKTTEELTESIPFFREIVTGAF 

■ 

LSFLPLFDPGWQILLAQGRAHGWLKWQREQYEFDHALVYAEKNWGHSFPS 
PAAFPVFEPHWQICMAGGLSTGWIEWGGERFEFRDAPSYSEKNWGGGFPR 
EKFIKVT M KLPLTGQQYSEKVTENC 

RWFWLQANYFPDHPG- LS VTAAGGERI VLGRPE — EVALIGLHHQGNFY 
KWFWVQCNVFEGATGEVALTAGGGLRQLPGLTETYENAALVCVHYDGKMY 

VAIWKQLGLYTDCEA-KAV-- — EKFLEIFKE — ET — — 

EFGPGHGTVTWQVAPWGRWQLKASNDRYWVKLSGKTDKKGSLVHTP-TAQ 
EFVPWNGWRWEMSPWGYWYITAENENHWELEARTNEAGTPLRAPTTEV 
-FPPG-SSILFALSPTGSLTVAFSKDDS-IPETGIAVIENKLLAEA-VLE 

GLQLNCRDTTRGYLYLQLGSVGHG LIVQGETDTAGLEVGG 

GLATACRDSCYGELKLQIWERLYDGSKGSVILETKSSMAAVEIGGGPWFG 

— SIIGKNGVSPGTRLSVAERLSQ LMMKNKDEKEVSDHSL 

- — DWGLTEENLSKKT -—VPF 

TWKGDTSNTPELLKQALQVPLDLESALGLVPFFKPPGL 

EEKLAKEN ' 



FIG. 35 




SEQUENCE LISTING 

<110> Lassner, Michael 

Post-Beittenmiller, Martha 
Savidge, Beth 
Weiss, James 

<120> Nucleic Acid Sequences Involved in 
Tocopherol Synthesis 

<130> 17133/00/WO 

<150> 60/129,899 
<151> 1999-04-15 

<150> 60/146,461 
<151> 1999-07-30 

<150> PCT/US00/10368 
<151> 2000-04-14 

<160> 94 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 1182 
<212> DNA 

<213> Arabidopsis sp 
<400> 1 

atggagtctc tgctctctag ttcttctctt gtttccgctg ctggtgggtt ttgttggaag 
60 

aagcagaatc taaagctcca ctctttatca gaaatccgag ttctgcgttg tgattcgagt 
120 

aaagttgtcg caaaaccgaa gtttaggaac aatcttgtta ggcctgatgg tcaaggatct 
180 

tcattgttgt tgtatccaaa acataagtcg agatttcggg ttaatgccac tgcgggtcag 
240 

cctgaggctt tcgactcgaa tagcaaacag aagtctttta gagactcgtt agatgcgttt 
300 

tacaggtttt ctaggcctca tacagttatt ggcacagtgc ttagcatttt atctgtatct 
360 

ttcttagcag tagagaaggt ttctgatata tctcctttac ttttcactgg catcttggag 
420 

gctgttgttg cagctctcat gatgaacatt tacatagttg ggctaaatca gttgtctgat 
480 

gttgaaatag ataaggttaa caagccctat cttccattgg catcaggaga atattctgtt 
540 

aacaccggca ttgcaatagt agcttccttc tccatcatga gtttctggct tgggtggatt 
600 

gttggttcat ggccattgtt ctgggctctt tttgtgagtt tcatgctcgg tactgcatac 
660 

tctatcaatt tgccactttt acggtggaaa agatttgcat tggttgcagc aatgtgtatc 
720 

ctcgctgtcc gagctattat tgttcaaatc gccttttatc tacatattca gacacatgtg 
780 

tttggaagac caatcttgtt cactaggcct cttattttcg ccactgcgtt tatgagcttt 
840 

ttctctgtcg ttattgcatt gtttaaggat atacctgata tcgaagggga taagatattc 
900 

ggaatccgat cattctctgt aactctgggt cagaaacggg tgttttggac atgtgttaca 
960 

ctacttcaaa tggcttacgc tgttgcaatt ctagttggag ccacatctcc attcatatgg 
1020 

agcaaagtca tctcggttgt gggtcatgtt atactcgcaa caactttgtg ggctcgagct 
1080 

aagtccgttg atctgagtag caaaaccgaa ataacttcat gttatatgtt catatggaag 
1140 

ctcttttatg cagagtactt gctgttacct tttttgaagt ga 
1182 

<210> 2 
<211> 393 
<212> PRT 

<213> Arabidopsis sp 

<400> 2 

Met Glu Ser Leu Leu Ser Ser Ser Ser Leu Val Ser Ala Ala Gly Gly 
1 5 10 15 

Phe Cys Trp Lys Lys Gln Asn Leu Lys Leu His Ser Leu Ser Glu Ile 
20 25 30 

Arg Val Leu Arg Cys Asp Ser Ser Lys Val Val Ala Lys Pro Lys Phe 
35 40 45 

Arg Asn Asn Leu Val Arg Pro Asp Gly Gln Gly Ser Ser Leu Leu Leu 
50 55 60 

Tyr Pro Lys His Lys Ser Arg Phe Arg Val Asn Ala Thr Ala Gly Gln 
65 70 75 80 

Pro Glu Ala Phe Asp Ser Asn Ser Lys Gln Lys Ser Phe Arg Asp Ser 
85 90 95 

Leu Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Val Ile Gly Thr 
100 105 110 

Val Leu Ser Ile Leu Ser Val Ser Phe Leu Ala Val Glu Lys Val Ser 
115 120 125 

Asp Ile Ser Pro Leu Leu Phe Thr Gly Ile Leu Glu Ala Val Val Ala 
130 135 140 

Ala Leu Met Met Asn Ile Tyr Ile Val Gly Leu Asn Gln Leu Ser Asp 
145 150 155 160 

Val Glu Ile Asp Lys Val Asn Lys Pro Tyr Leu Pro Leu Ala Ser Gly 
165 170 175 

Glu Tyr Ser Val Asn Thr Gly Ile Ala Ile Val Ala Ser Phe Ser Ile 
180 185 190 

Met Ser Phe Trp Leu Gly Trp Ile Val Gly Ser Trp Pro Leu Phe Trp 
195 200 205 

Ala Leu Phe Val Ser Phe Met Leu Gly Thr Ala Tyr Ser Ile Asn Leu 
210 215 220 

Pro Leu Leu Arg Trp Lys Arg Phe Ala Leu Val Ala Ala Met Cys Ile 
225 230 235 240 

Leu Ala Val Arg Ala Ile Ile Val Gln Ile Ala Phe Tyr Leu His Ile 
245 250 255 

Gln Thr His Val Phe Gly Arg Pro Ile Leu Phe Thr Arg Pro Leu Ile 
260 265 270 

Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Val Val Ile Ala Leu Phe 
275 280 285 

Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys Ile Phe Gly Ile Arg Ser 
290 295 300 

Phe Ser Val Thr Leu Gly Gln Lys Arg Val Phe Trp Thr Cys Val Thr 
305 310 315 320 

Leu Leu Gln Met Ala Tyr Ala Val Ala Ile Leu Val Gly Ala Thr Ser 
325 330 335 

Pro Phe Ile Trp Ser Lys Val Ile Ser Val Val Gly His Val Ile Leu 
340 345 350 

Ala Thr Thr Leu Trp Ala Arg Ala Lys Ser Val Asp Leu Ser Ser Lys 
355 360 365 

Thr Glu Ile Thr Ser Cys Tyr Met Phe Ile Trp Lys Leu Phe Tyr Ala 
370 375 380 

Glu Tyr Leu Leu Leu Pro Phe Leu Lys 
385 390 

<210> 3 
<211> 1224 
<212> DNA 

<213> Arabidopsis sp 
<400> 3 

atggcgtttt ttgggctctc ccgtgtttca agacggttgt tgaaatcttc cgtctccgta 
60 

actccatctt cttcctctgc tcttttgcaa tcacaacata aatccttgtc caatcctgtg 
120 

actacccatt acacaaatcc tttcactaag tgttatcctt catggaatga taattaccaa 
180 

gtatggagta aaggaagaga attgcatcag gagaagtttt ttggtgttgg ttggaattac 
240 

agattaattt gtggaatgtc gtcgtcttct tcggttttgg agggaaagcc gaagaaagat 
300 

gataaggaga agagtgatgg tgttgttgtt aagaaagctt cttggataga tttgtattta 
360 

ccagaagaag ttagaggtta tgctaagctt gctcgattgg ataaacccat tggaacttgg 
420 

ttgcttgcgt ggccttgtat gtggtcgatt gcgttggctg ctgatcctgg aagccttcca 
480 

agttttaaat atatggcttt atttggttgc ggagcattac ttcttagagg tgctggttgt 
540 

actataaatg atctgcttga tcaggacata gatacaaagg ttgatcgtac aaaactaaga 
600 

cctatcgcca gtggtctttt gacaccattt caagggattg gatttctcgg gctgcagttg 
660 

cttttaggct tagggattct tctccaactt aacaattaca gccgtgtttt aggggcttca 
720 

tctttgttac ttgtcttttc ctacccactt atgaagaggt ttacattttg gcctcaagcc 
780 

tttttaggtt tgaccataaa ctggggagca ttgttaggat ggactgcagt taaaggaagc 
840 

atagcaccat ctattgtact ccctctctat ctctccggag tctgctggac ccttgtttat 
900 

gatactattt atgcacatca ggacaaagaa gatgatgtaa aagttggtgt taagtcaaca 
960 

gcccttagat tcggtgataa tacaaagctt tggttaactg gatttggcac agcatccata 
1020 

ggttttcttg cactttctgg attcagtgca gatctcgggt ggcaatatta cgcatcactg 
1080 

gccgctgcat caggacagtt aggatggcaa atagggacag ctgacttatc atctggtgct 
1140 

gactgcagta gaaaatttgt gtcgaacaag tggtttggtg ctattatatt tagtggagtt 
1200 

gtacttggaa gaagttttca ataa 
1224 

<210> 4 
<211> 407 
<212> PRT 

<213> Arabidopsis sp 
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<400> 4 

Met Ala Phe Phe Gly Leu Ser Arg Val Ser Arg Arg Leu Leu Lys Ser 
1 5 10 15 

Ser Val Ser Val Thr Pro Ser Ser Ser Ser Ala Leu Leu Gln Ser Gln 
20 25 30 

His Lys Ser Leu Ser Asn Pro Val Thr Thr His Tyr Thr Asn Pro Phe 
35 40 45 

Thr Lys Cys Tyr Pro Ser Trp Asn Asp Asn Tyr Gln Val Trp Ser Lys 
50 55 60 

Gly Arg Glu Leu His Gln Glu Lys Phe Phe Gly Val Gly Trp Asn Tyr 
65 70 75 80 

Arg Leu Ile Cys Gly Met Ser Ser Ser Ser Ser Val Leu Glu Gly Lys 
85 90 95 

Pro Lys Lys Asp Asp Lys Glu Lys Ser Asp Gly Val Val Val Lys Lys 
100 105 110 

Ala Ser Trp Ile Asp Leu Tyr Leu Pro Glu Glu Val Arg Gly Tyr Ala 
115 120 125 

Lys Leu Ala Arg Leu Asp Lys Pro Ile Gly Thr Trp Leu Leu Ala Trp 
130 135 140 

Pro Cys Met Trp Ser Ile Ala Leu Ala Ala Asp Pro Gly Ser Leu Pro 
145 150 155 160 

Ser Phe Lys Tyr Met Ala Leu Phe Gly Cys Gly Ala Leu Leu Leu Arg 
165 170 175 

Gly Ala Gly Cys Thr Ile Asn Asp Leu Leu Asp Gln Asp Ile Asp Thr 
180 185 190 

Lys Val Asp Arg Thr Lys Leu Arg Pro Ile Ala Ser Gly Leu Leu Thr 
195 200 205 

Pro Phe Gln Gly Ile Gly Phe Leu Gly Leu Gln Leu Leu Leu Gly Leu 
210 215 220 

Gly Ile Leu Leu Gln Leu Asn Asn Tyr Ser Arg Val Leu Gly Ala Ser 
225 230 235 240 

Ser Leu Leu Leu Val Phe Ser Tyr Pro Leu Met Lys Arg Phe Thr Phe 
245 250 255 

Trp Pro Gln Ala Phe Leu Gly Leu Thr Ile Asn Trp Gly Ala Leu Leu 
260 265 270 

Gly Trp Thr Ala Val Lys Gly Ser Ile Ala Pro Ser Ile Val Leu Pro 
275 280 285 

Leu Tyr Leu Ser Gly Val Cys Trp Thr Leu Val Tyr Asp Thr Ile Tyr 
290 295 300 

Ala His Gln Asp Lys Glu Asp Asp Val Lys Val Gly Val Lys Ser Thr 
305 310 315 320 

Ala Leu Arg Phe Gly Asp Asn Thr Lys Leu Trp Leu Thr Gly Phe Gly 
325 330 335 

Thr Ala Ser Ile Gly Phe Leu Ala Leu Ser Gly Phe Ser Ala Asp Leu 
340 345 350 

Gly Trp Gln Tyr Tyr Ala Ser Leu Ala Ala Ala Ser Gly Gln Leu Gly 
355 360 365 

Trp Gln Ile Gly Thr Ala Asp Leu Ser Ser Gly Ala Asp Cys Ser Arg 
370 375 380 

Lys Phe Val Ser Asn Lys Trp Phe Gly Ala Ile Ile Phe Ser Gly Val 
385 390 395 400 

Val Leu Gly Arg Ser Phe Gln 
405 

<210> 5 
<211> 1296 
<212> DNA 

<213> Arabidopsis sp 
<400> 5 
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atgtggcgaa gatcfcgttgb btctcgbbba fccbbcaagaa tctctgtbtc ttcttcgbfca 
60 

ccaaacccta gacbgattcc bbggbcccgc gaattatgbg ccgtbaabag cttcbcccag 
120 

ccbccggbcb cgacggaabc aacbgcbaag tbagggabca cbggtgtbag abctgabgcc 
180 

aabcgagbbb btgccacbgc bacbgccgcc gctacagcfca cagcbaccac cggbgagabb 
240 

fccgtctagag ttgcggcbbb ggcbggabta gggcabcacb acgcbcgttg btabtgggag 
300 

ctttctaaag ctaaactfcag tabgcbbgbg gttgcaactb cbggaacbgg gbatattctg 
360 

ggbacgggaa atgcfcgcaab fcagcbbcccg gggcbtbgbb acacabgtgc aggaaccatg 
420 

atgattgctg cabctgcbaa ttccbtgaat cagattbttg agabaagcaa bgabtctaag 
480 

atgaaaagaa cgatgctaag gccabbgccb bcaggacgta tbagtgbbcc acacgctgtb 
540 

gcatgggcta ctatfcgcbgg tgctbctggt gcttgtbtgt tggccagcaa gactaatatg 
600 

ttggcbgctg gacttgcatc bgccaatctt gtactbfcatg cgtbbgttta tactccgttg 
660 

aagcaacttc accctatcaa bacatgggtt ggcgctgttg btggtgctat cccacccbtg 
720 

ctbgggtggg cggcagcgbc fcggtcagabt bcabacaatt cgabgatbct tccagcbgct 
780 

ctbtacbttt ggcagatacc tcatbhtatg gccctbgcac atcbcbgccg caatgattat 
840 

gcagcbggag gttacaagat gtbgtcactc tbbgabccgt cagggaagag aatagcagca 
900 

gtggctctaa ggaactgctt tbacabgabc cctctcggtb bcabcgccta bgactggggg 
960 

ttaacctcaa gttggttttg ccbcgaatca acacttctca cacbagcaafc cgctgcaaca 
1020 

gcattttcat tctaccgaga ccggaccabg cataaagcaa ggaaaatgtt ccatgccagt 
1080 

cttctcttcc ttcctgtttt catgfccbggt ctbcttctac accgtgbctc baatgataat 
1140 

cagcaacaac tcgbagaaga agccggatta acaaattctg fcabctggbga agtcaaaact 
1200 

cagaggcgaa agaaacgtgt ggctcaaccb ccggtggctt atgcctctgc tgcaccgbtt 
1260 

cctttcctcc cagctccttc ctbcbactcb ccatga 
1296 

<210> 6 
<211> 431 
<212> PRT 

<213> Arabidopsis sp 
<400> 6 

Met Trp Arg Arg Ser Val Val Tyr Arg Phe Ser Ser Arg Ile Ser Val 
1 5 10 15 

Ser Ser Ser Leu Pro Asn Pro Arg Leu Ile Pro Trp Ser Arg Glu Leu 
20 25 30 

Cys Ala Val Asn Ser Phe Ser Gln Pro Pro Val Ser Thr Glu Ser Thr 
35 40 45 

Ala Lys Leu Gly Ile Thr Gly Val Arg Ser Asp Ala Asn Arg Val Phe 
50 55 60 

Ala Thr Ala Thr Ala Ala Ala Thr Ala Thr Ala Thr Thr Gly Glu Ile 
65 70 75 80 

Ser Ser Arg Val Ala Ala Leu Ala Gly Leu Gly His His Tyr Ala Arg 
85 90 95 

Cys Tyr Trp Glu Leu Ser Lys Ala Lys Leu Ser Met Leu Val Val Ala 
100 105 110 

Thr Ser Gly Thr Gly Tyr Ile Leu Gly Thr Gly Asn Ala Ala Ile Ser 
115 120 125 

Phe Pro Gly Leu Cys Tyr Thr Cys Ala Gly Thr Met Met Ile Ala Ala 
130 135 140 

Ser Ala Asn Ser Leu Asn Gln Ile Phe Glu Ile Ser Asn Asp Ser Lys 
145 150 155 160 

Met Lys Arg Thr Met Leu Arg Pro Leu Pro Ser Gly Arg Ile Ser Val 
165 170 175 

Pro His Ala Val Ala Trp Ala Thr Ile Ala Gly Ala Ser Gly Ala Cys 
180 185 190 

Leu Leu Ala Ser Lys Thr Asn Met Leu Ala Ala Gly Leu Ala Ser Ala 
195 200 205 

Asn Leu Val Leu Tyr Ala Phe Val Tyr Thr Pro Leu Lys Gln Leu His 
210 215 220 

Pro Ile Asn Thr Trp Val Gly Ala Val Val Gly Ala Ile Pro Pro Leu 
225 230 235 240 

Leu Gly Trp Ala Ala Ala Ser Gly Gln Ile Ser Tyr Asn Ser Met Ile 
245 250 255 

Leu Pro Ala Ala Leu Tyr Phe Trp Gln Ile Pro His Phe Met Ala Leu 
260 265 270 

Ala His Leu Cys Arg Asn Asp Tyr Ala Ala Gly Gly Tyr Lys Met Leu 
275 280 285 

Ser Leu Phe Asp Pro Ser Gly Lys Arg Ile Ala Ala Val Ala Leu Arg 
290 295 300 

Asn Cys Phe Tyr Met Ile Pro Leu Gly Phe Ile Ala Tyr Asp Trp Gly 
305 310 315 320 

Leu Thr Ser Ser Trp Phe Cys Leu Glu Ser Thr Leu Leu Thr Leu Ala 
325 330 335 

Ile Ala Ala Thr Ala Phe Ser Phe Tyr Arg Asp Arg Thr Met His Lys 
340 345 350 

Ala Arg Lys Met Phe His Ala Ser Leu Leu Phe Leu Pro Val Phe Met 
355 360 365 

Ser Gly Leu Leu Leu His Arg Val Ser Asn Asp Asn Gln Gln Gln Leu 
370 375 380 

Val Glu Glu Ala Gly Leu Thr Asn Ser Val Ser Gly Glu Val Lys Thr 
385 390 395 400 

Gln Arg Arg Lys Lys Arg Val Ala Gln Pro Pro Val Ala Tyr Ala Ser 
405 410 415 

Ala Ala Pro Phe Pro Phe Leu Pro Ala Pro Ser Phe Tyr Ser Pro 
420 425 430 

<210> 7 
<211> 479 
<212> DNA 

<213> Arabidopsis sp 
<400> 7 

ggaaactccc ggagcacctg tttgcaggta ccgctaacct taatcgataa tttatttctc 
60 

ttgtcaggaa ttatgtaagt ctggtggaag gctcgcatac catttttgca ttgcctttcg 
120 

ctatgatcgg gtttactttg ggtgtgatga gaccaggcgt ggctttatgg tatggcgaaa 

180 

acccattttt atccaatgct gcattccctc ccgatgattc gttctttcat tcctatacag 
240 

gtatcatgct gataaaactg ttactggtac tggtttgtat ggtatcagca agaagcgcgg 
300 

cgatggcgtt taaccggtat ctcgacaggc attttgacgc gaagaacccg cgtactgcca 
360 

tccgtgaaat acctgcgggc gtcatatctg ccaacagtgc gctggtgttt acgataggct 
420 

gctgcgtggt attctgggtg gcctgttatt tcattaacac gatctgtttt tacctggcg 
479 

<210> 8 
<211> 551 
<212> DNA 

<213> Arabidopsis sp 
<220> 

<221> misc_feature 
<222> (1)...(551) 
<223> n = A, T, C or G 

<400> 8 

ttgtggctta caccttaatg agcatacgcc agnccattac ggctcgttaa tcggcgccat 
60 

ngccggngct gntgcaccgg tagtgggcta ctgcgccgtg accaatcagc ttgatctagc 
120 

ggctcttatt ctgtttttaa ttttactgtt ctggcaaatg ccgcattttt acgcgatttc 
180 

cattttcagg ctaaaagact tttcagcggc ctgtattccg gtgctgccca tcattaaaga 
240 

cctgcgctat accaaaatca gcatgctggt ttacgtgggc ttatttacac tggctgctat 
300 

catgccggcc ctcttagggt atgccggttg gatttatggg atagcggcct taattttagg 
360 

cttgtattgg ctttatattg ccatacaagg attcaagacc gccgatgatc aaaaatggtc 
420 

tcgtaagatg tttggatctt cgattttaat cattaccctc ttgtcggtaa tgatgcttgt 
480 

ttaaacttac tgcctcctga agtttatata tcgataattt cagcttaagg aggcttagtg 
540 

gttaattcaa t 
551 

<210> 9 
<211> 297 
<212> PRT 

<213> Arabidopsis sp 
<400> 9 

Met Val Leu Ala Glu Val Pro Lys Leu Ala Ser Ala Ala Glu Tyr Phe 
1 5 10 15 

Phe Lys Arg Gly Val Gln Gly Lys Gln Phe Arg Ser Thr Ile Leu Leu 
20 25 30 

Leu Met Ala Thr Ala Leu Asn Val Arg Val Pro Glu Ala Leu Ile Gly 
35 40 45 

Glu Ser Thr Asp Ile Val Thr Ser Glu Leu Arg Val Arg Gln Arg Gly 
50 55 60 

Ile Ala Glu Ile Thr Glu Met Ile His Val Ala Ser Leu Leu His Asp 
65 70 75 80 

Asp Val Leu Asp Asp Ala Asp Thr Arg Arg Gly Val Gly Ser Leu Asn 
85 90 95 

Val Val Met Gly Asn Lys Val Val Ala Leu Leu Ala Thr Ala Val Glu 
100 105 110 

His Leu Val Thr Gly Glu Thr Met Glu Ile Thr Ser Ser Thr Glu Gln 
115 120 125 

Arg Tyr Ser Met Asp Tyr Tyr Met Gln Lys Thr Tyr Tyr Lys Thr Ala 
130 135 140 

Ser Leu Ile Ser Asn Ser Cys Lys Ala Val Ala Val Leu Thr Gly Gln 
145 150 155 160 

Thr Ala Glu Val Ala Val Leu Ala Phe Glu Tyr Gly Arg Asn Leu Gly 
165 170 175 

Leu Ala Phe Gln Leu Ile Asp Asp Ile Leu Asp Phe Thr Gly Thr Ser 
180 185 190 

Ala Ser Leu Gly Lys Gly Ser Leu Ser Asp Ile Arg His Gly Val Ile 
195 200 205 

Thr Ala Pro Ile Leu Phe Ala Met Glu Glu Phe Pro Gln Leu Arg Glu 
210 215 220 

Val Val Asp Gln Val Glu Lys Asp Pro Arg Asn Val Asp Ile Ala Leu 
225 230 235 240 

Glu Tyr Leu Gly Lys Ser Lys Gly Ile Gln Arg Ala Arg Glu Leu Ala 
245 250 255 

Met Glu His Ala Asn Leu Ala Ala Ala Ala Ile Gly Ser Leu Pro Glu 
260 265 270 

Thr Asp Asn Glu Asp Val Lys Arg Ser Arg Arg Ala Leu Ile Asp Leu 
275 280 285 

Thr His Arg Val Ile Thr Arg Asn Lys 
290 295 

<210> 10 
<211> 561 
<212> DNA 

<213> Arabidopsis sp 
<400> 10 

aagcgcatcc gtcctcttct acgattgccg ccagccgcat gtatggctgc ataaccgacc 
60 

gcccctatcc gctcgcggcc gcggtcgaat tcattcacac cgcgacgctg ctgcatgacg 
120 

acgtcgtcga tgaaagcgat ttgcgccgcg gccgcgaaag cgcgcataag gttttcggca 
180 

atcaggcgag cgtgctcgtc ggcgatttcc ttttctcccg cgccttccag ctgatggtgg 
240 

aagacggctc gctcgacgcg ctgcgcattc tctcggatgc ctccgccgtg atcgcgcagg 
300 

gcgaagtgat gcagctcggc accgcgcgca atcttgaaac caatatgagc cagtatctcg 
360 

atgtgatcag cgcgaagacc gccgcgctct ttgccgccgc ctgcgaaatc ggcccggtga 
420 

tggcgaacgc gaaggcggaa gatgctgccg cgatgtgcga atacggcatg aatctcggta 
480 

tcgccttcca gatcatcgac gaccttctcg attacggcac cggcggccac gccgagcttg 
540 

gcaagaacac gggcgacgat t 
561 

<210> 11 
<211> 966 
<212> DNA 

<213> Arabidopsis sp 
<400> 11 

atggtacttg ccgaggttcc aaagcttgcc tctgctgctg agtacttctt caaaaggggt 
60 

gtgcaaggaa aacagtttcg ttcaactatt ttgctgctga tggcgacagc tctgaatgta 
120 

cgcgttccag aagcattgat tggggaatca acagatatag tcacatcaga attacgcgta 
180 

aggcaacggg gtattgctga aatcactgaa atgatacacg tcgcaagtct actgcacgat 
240 

gatgtcttgg atgatgccga tacaaggcgt ggtgttggtt ccttaaatgt tgtaatgggt 
300 

aacaagatgt cggtattagc aggagacttc ttgctctccc gggcttgtgg ggctctcgct 
360 

gctttaaaga acacagaggt tgtagcatta cttgcaactg ctgtagaaca tcttgttacc 
420 

ggtgaaacca tggaaataac tagttcaacc gagcagcgtt atagtatgga ctactacatg 
480 

cagaagacat attataagac agcatcgcta atctctaaca gctgcaaagc tgttgccgtt 
540 

ctcactggac aaacagcaga agttgccgtg ttagcttttg agtatgggag gaatctgggt 
600 

ttagcattcc aattaataga cgacattctt gatttcacgg gcacatctgc ctctctcgga 
660 

aagggatcgt tgtcagatat tcgccatgga gtcataacag ccccaatcct ctttgccatg 
720 

gaagagtttc ctcaactacg cgaagttgtt gatcaagttg aaaaagatcc taggaatgtt 
780 

gacattgctt tagagtatct tgggaagagc aagggaatac agagggcaag agaattagcc 
840 

atggaacatg cgaatctagc agcagctgca atcgggtctc tacctgaaac agacaatgaa 
900 

gatgtcaaaa gatcgaggcg ggcacttatt gacttgaccc atagagtcat caccagaaac 
960 

aagtga 
966 

<210> 12 
<211> 321 
<212> PRT 

<213> Arabidopsis sp 

<400> 12 

Met Val Leu Ala Glu Val Pro Lys Leu Ala Ser Ala Ala Glu Tyr Phe 
1 5 10 15 

Phe Lys Arg Gly Val Gln Gly Lys Gln Phe Arg Ser Thr Ile Leu Leu 
20 25 30 

Leu Met Ala Thr Ala Leu Asn Val Arg Val Pro Glu Ala Leu Ile Gly 
35 40 45 

Glu Ser Thr Asp Ile Val Thr Ser Glu Leu Arg Val Arg Gln Arg Gly 
50 55 60 

Ile Ala Glu Ile Thr Glu Met Ile His Val Ala Ser Leu Leu His Asp 
65 70 75 80 

Asp Val Leu Asp Asp Ala Asp Thr Arg Arg Gly Val Gly Ser Leu Asn 
85 90 95 

Val Val Met Gly Asn Lys Met Ser Val Leu Ala Gly Asp Phe Leu Leu 
100 105 110 

Ser Arg Ala Cys Gly Ala Leu Ala Ala Leu Lys Asn Thr Glu Val Val 
115 120 125 

Ala Leu Leu Ala Thr Ala Val Glu His Leu Val Thr Gly Glu Thr Met 
130 135 140 

Glu Ile Thr Ser Ser Thr Glu Gln Arg Tyr Ser Met Asp Tyr Tyr Met 
145 150 155 160 

Gln Lys Thr Tyr Tyr Lys Thr Ala Ser Leu Ile Ser Asn Ser Cys Lys 
165 170 175 

Ala Val Ala Val Leu Thr Gly Gln Thr Ala Glu Val Ala Val Leu Ala 
180 185 190 

Phe Glu Tyr Gly Arg Asn Leu Gly Leu Ala Phe Gln Leu Ile Asp Asp 
195 200 205 

Ile Leu Asp Phe Thr Gly Thr Ser Ala Ser Leu Gly Lys Gly Ser Leu 
210 215 220 

Ser Asp Ile Arg His Gly Val Ile Thr Ala Pro Ile Leu Phe Ala Met 
225 230 235 240 

Glu Glu Phe Pro Gln Leu Arg Glu Val Val Asp Gln Val Glu Lys Asp 
245 250 255 

Pro Arg Asn Val Asp Ile Ala Leu Glu Tyr Leu Gly Lys Ser Lys Gly 
260 265 270 

Ile Gln Arg Ala Arg Glu Leu Ala Met Glu His Ala Asn Leu Ala Ala 
275 280 285 

Ala Ala Ile Gly Ser Leu Pro Glu Thr Asp Asn Glu Asp Val Lys Arg 
290 295 300 

Ser Arg Arg Ala Leu Ile Asp Leu Thr His Arg Val Ile Thr Arg Asn 
305 310 315 320 

Lys 

<210> 13 
<211> 621 
<212> DNA 

<213> Arabidopsis sp 
<400> 13 

gctttctcct ttgctaattc ttgagctttc ttgatcccac cgcgatttct aactatttca 
60 

atcgcttctt caagcgatcc aggctcacaa aactcagact caatgatctc tcttagcctt 
120 

ggctcattct ctagcgcgaa gatcactggc gccgttatgt tacctttggc taagtcatta 
180 

gctgcaggct tacctaactg ctctgtggac tgagtgaagt ccagaatgtc atcaactact
240 

tgaaaagata aaccgagatt cttcccgaac tgatacattt gctctgcgac cttgctttcg
300 

actttactga aaattgctgc tcctttggtg cttgcagcta ctaatgaagc tgtcttgtag
360 

taactcttta gcatgtagtc atcaagcttg acatcacaat cgaataaact cgatgcttgc
420 

tttatctcac cgcttgcaaa atctttgatc acctgcaaaa agataaatca agattcagac 
480 

caaatgttct ttgtattgag tagcttcatc taatctcaga aaggaatatt acctgactta 
540 

tgagcttaat gacttcaagg ttttcgagat ttgtaagtac catgatgctt gagcaacatg
600 

aaatccccag ctaatacagc t 
621 

<210> 14 
<211> 741
<212> DNA 

<213> Arabidopsis sp 
<400> 14 

ggtgagtttt gttaatagtt atgagattca tctatttttg tcataaaatt gtttggtttg
60

gtttaaactc tgtgtataat tgcaggaaag gaaacagttc atgagctttt cggcacaaga
120

gtagcggtgc tagctggaga tttcatgttt gctcaagcgt catggtactt agcaaatctc
180

gagaatcttg aagttattaa gctcatcagt caggtactta gttactctta cattgttttt
240

ctatgaggtt gagctatgaa tctcatttcg ttgaataatg ctgtgcctca aacttttttt
300

catgttttca ggtgatcaaa gactttgcaa gcggagagat aaagcaggcg tccagcttat
360

ttgactgcga caccaagctc gacgagtact tactcaaaag tttctacaag acagcctctt
420

tagtggctgc gagcaccaaa ggagctgcca ttttcagcag agttgagcct gatgtgacag
480

aacaaatgta cgagtttggg aagaatctcg gtctctcttt ccagatagtt gatgatattt
540

tggatttcac tcagtcgaca gagcagctcg ggaagccagc agggagtgat ttggctaaag
600

gtaacttaac agcacctgtg attttcgctc tggagaggga gccaaggcta agagagatca
660

ttgagtcaaa gttctgtgag gcgggttctc tggaagaagc gattgaagcg gtgacaaaag
720

gtggggggat taagagagca c 
741 

<210> 15 
<211> 1087 
<212> DNA 

<213> Arabidopsis sp 
<400> 15 

cctcttcagc caatccagag gaagaagaga caacttttta tctttcgtca agagtctccg
60

aaaacgcacg gttttatgct ctctcttctg ccctcacctc acaagacgca gggcacatga
120

ttcaaccaga gggaaaaagc aacgataaca actctgcttt tgatttcaag ctgtatatga
180

tccgcaaagc cgagtctgta aatgcggctc tcgacgtttc cgtaccgctt ctgaaacccc
240

ttacgatcca agaagcggtc aggtactctt tgctagccgg cggaaaacgt gtgaggcctc
300

tgctctgcat tgccgcttgt gagcttgtgg ggggcgacga ggctactgcc atgtcagccg
360

cttgcgcggt cgagatgatc cacacaagct ctctcattca tgacgatctt ccgtgcatgg
420

acaatgccga cctccgtaga ggcaagccca ccaatcacaa ggtatgttgt ttaattatat
480

gaaggctcag agataatgct gaactagtgt tgaaccaatt tttgctcaaa caaggtatat
540

ggagaagaca tggcggtttt ggcaggtgat gcactccttg cattggcgtt tgagcacatg
600

acggttgtgt cgagtgggtt ggtcgctccc gagaagatga ttcgcgccgt ggttgagctg
660

gccagggcca tagggactac agggctagtt gctggacaaa tgatagacct agccagcgaa
720

agactgaatc cagacaaggt tggattggag catctagagt tcatccatct ccacaaaacg
780

gcggcattgt tggaggcagc ggcagtttta ggggttataa tgggaggtgg aacagaggaa
840

gaaatcgaaa agcttagaaa gtatgctagg tgtattggac tactgtttca ggttgttgat
900

gacattctcg acgtaacaaa atctactgag gaattgggta agacagccgg aaaagacgta
960

atggccggaa agctgacgta tccaaggctg ataggtttgg agggatccag ggaagttgca
1020

gagcacctga ggagagaagc agaggaaaag cttaaagggt ttgatccaag tcaggcggcg
1080

cctctgg
1087 

<210> 16 
<211> 1164 
<212> DNA 

<213> Arabidopsis sp 
<400> 16 

atgacttcga ttctcaacac tgtctccacc atccactctt ccagagttac ctccgtcgat
60

cgagtcggag tcctctctct tcggaattcg gattccgttg agttcactcg ccggcgttct
120

ggtttctcga cgttgatcta cgaatcaccc gggcggagat ttgttgtgcg tgcggcggag
180

actgatactg ataaagttaa atctcagaca cctgacaagg caccagccgg tggttcaagc
240

attaaccagc ttctcggtat caaaggagca tctcaagaaa ctaataaatg gaagattcgt
300

cttcagctta caaaaccagt cacttggcct ccactggttt ggggagtcgt ctgtggtgct
360

gctgcttcag ggaactttca ttggacccca gaggatgttg ctaagtcgat tctttgcatg
420

atgatgtctg gtccttgtct tactggctat acacagacaa tcaacgactg gtatgataga
480

gatatcgacg caattaatga gccatatcgt ccaattccat ctggagcaat atcagagcca
540

gaggttatta cacaagtctg ggtgctatta ttgggaggtc ttggtattgc tggaatatta
600

gatgtgtggg cagggcatac cactcccact gtcttctatc ttgctttggg aggatcattg
660

ctatcttata tatactctgc tccacctctt aagctaaaac aaaatggatg ggttggaaat
720

tttgcacttg gagcaagcta tattagtttg ccatggtggg ctggccaagc attgtttggc
780

actcttacgc cagatgttgt tgttctaaca ctcttgtaca gcatagctgg gttaggaata
840

gccattgtta acgacttcaa aagtgttgaa ggagatagag cattaggact tcagtctctc
900

ccagtagctt ttggcaccga aactgcaaaa tggatatgcg ttggtgctat agacattact
960

cagctttctg ttgccggata tctattagca tctgggaaac cttattatgc gttggcgttg
1020

gttgctttga tcattcctca gattgtgttc cagtttaaat actttctcaa ggaccctgtc
1080

aaatacgacg tcaagtacca ggcaagcgcg cagccattct tggtgctcgg aatatttgta
1140

acggcattag catcgcaaca ctga
1164 

<210> 17 
<211> 387 
<212> PRT 

<213> Arabidopsis sp 
<400> 17 

Met Thr Ser Ile Leu Asn Thr Val Ser Thr Ile His Ser Ser Arg Val

1 5 10 15

Thr Ser Val Asp Arg Val Gly Val Leu Ser Leu Arg Asn Ser Asp Ser

20 25 30

Val Glu Phe Thr Arg Arg Arg Ser Gly Phe Ser Thr Leu Ile Tyr Glu

35 40 45

Ser Pro Gly Arg Arg Phe Val Val Arg Ala Ala Glu Thr Asp Thr Asp

50 55 60

Lys Val Lys Ser Gln Thr Pro Asp Lys Ala Pro Ala Gly Gly Ser Ser
65 70 75 80

Ile Asn Gln Leu Leu Gly Ile Lys Gly Ala Ser Gln Glu Thr Asn Lys

85 90 95

Trp Lys Ile Arg Leu Gln Leu Thr Lys Pro Val Thr Trp Pro Pro Leu

100 105 110

Val Trp Gly Val Val Cys Gly Ala Ala Ala Ser Gly Asn Phe His Trp

115 120 125

Thr Pro Glu Asp Val Ala Lys Ser Ile Leu Cys Met Met Met Ser Gly

130 135 140

Pro Cys Leu Thr Gly Tyr Thr Gln Thr Ile Asn Asp Trp Tyr Asp Arg
145 150 155 160

Asp Ile Asp Ala Ile Asn Glu Pro Tyr Arg Pro Ile Pro Ser Gly Ala

165 170 175

Ile Ser Glu Pro Glu Val Ile Thr Gln Val Trp Val Leu Leu Leu Gly

180 185 190

Gly Leu Gly Ile Ala Gly Ile Leu Asp Val Trp Ala Gly His Thr Thr

195 200 205

Pro Thr Val Phe Tyr Leu Ala Leu Gly Gly Ser Leu Leu Ser Tyr Ile

210 215 220

Tyr Ser Ala Pro Pro Leu Lys Leu Lys Gln Asn Gly Trp Val Gly Asn
225 230 235 240

Phe Ala Leu Gly Ala Ser Tyr Ile Ser Leu Pro Trp Trp Ala Gly Gln

245 250 255

Ala Leu Phe Gly Thr Leu Thr Pro Asp Val Val Val Leu Thr Leu Leu

260 265 270

Tyr Ser Ile Ala Gly Leu Gly Ile Ala Ile Val Asn Asp Phe Lys Ser

275 280 285

Val Glu Gly Asp Arg Ala Leu Gly Leu Gln Ser Leu Pro Val Ala Phe

290 295 300

Gly Thr Glu Thr Ala Lys Trp Ile Cys Val Gly Ala Ile Asp Ile Thr
305 310 315 320

Gln Leu Ser Val Ala Gly Tyr Leu Leu Ala Ser Gly Lys Pro Tyr Tyr

325 330 335

Ala Leu Ala Leu Val Ala Leu Ile Ile Pro Gln Ile Val Phe Gln Phe

340 345 350

Lys Tyr Phe Leu Lys Asp Pro Val Lys Tyr Asp Val Lys Tyr Gln Ala

355 360 365

Ser Ala Gln Pro Phe Leu Val Leu Gly Ile Phe Val Thr Ala Leu Ala

370 375 380

Ser Gln His
385

<210> 18 
<211> 981 
<212> DNA 

<213> Arabidopsis sp 
<400> 18 

atgttgttta gtggttcagc gatcccatta agcagcttct gctctcttcc ggagaaaccc
60

cacactcttc ctatgaaact ctctcccgct gcaatccgat cttcatcctc atctgccccg
120

gggtcgttga acttcgatct gaggacgtat tggacgactc tgatcaccga gatcaaccag
180

aagctggatg aggccatacc ggtcaagcac cctgcgggga tctacgaggc tatgagatac
240

tctgtactcg cacaaggcgc caagcgtgcc cctcctgtga tgtgtgtggc ggcctgcgag
300

ctcttcggtg gcgatcgcct cgccgctttc cccaccgcct gtgccctaga aatggtgcac
360

gcggcttcgt tgatacacga cgacctcccc tgtatggacg acgatcctgt gcgcagagga
420

aagccatcta accacactgt ctacggctct ggcatggcca ttctcgccgg tgacgccctc
480

ttcccactcg ccttccagca cattgtctcc cacacgcctc ctgaccttgt tccccgagcc
540

accatcctca gactcatcac tgagattgcc cgcactgtcg gctccactgg tatggctgca
600

ggccagtacg tcgaccttga aggaggtccc tttcctcttt cctttgttca ggagaagaaa
660

ttcggagcca tgggtgaatg ctctgccgtg tgcggtggcc tattgggcgg tgccactgag
720

gatgagctcc agagtctccg aaggtacggg agagccgtcg ggatgctgta tcaggtggtc
780

gatgacatca ccgaggacaa gaagaagagc tatgatggtg gagcagagaa gggaatgatg
840

gaaatggcgg aagagctcaa ggagaaggcg aagaaggagc ttcaagtgtt tgacaacaag
900

tatggaggag gagacacact tgttcctctc tacaccttcg ttgactacgc tgctcatcga
960

cattttcttc ttcccctctg a
981 

<210> 19 
<211> 245 
<212> DNA 
<213> Glycine sp

<400> 19 

gcaacatctg ggactgggtt tgtcttgggg agtggtagtg ctgttgatct ttcggcactt
60

tcttgcactt gcttgggtac catgatggtt gctgcatctg ctaactcttt gaatcaggtg
120

tttgagatca ataatgatgc taaaatgaag agaacaagtc gcaggccact accctcagga
180

cgcatcacaa tacctcatgc agttggctgg gcatcctctg ttggattagc tggtacggct
240

ctact
245

<210> 20 
<211> 253 
<212> DNA 
<213> Glycine sp 

<400> 20 

attggctttc caagatcatt gggttttctt gttgcattca tgaccttcta ctccttgggt
60

ttggcattgt ccaaggatat acctgacgtt gaaggagata aagagcacgg cattgattct
120

tttgcagtac gtctaggtca gaaacgggca ttttggattt gcgtttcctt ttttgaaatg
180

gctttcggag ttggtatcct ggccggagca tcatgctcac acttttggac taaaattttc
240 

acgggtatgg gaa 
253 

<210> 21 
<211> 275
<212> DNA 
<213> Glycine sp 

<400> 21 

tgatcttcta ctctctgggt atggcattgt ccaaggatat atctgacgtt aaaggagata
60

aagcatacgg catcgatact ttagcgatac gtttgggtca aaaatgggta ttttggattt
120

gcattatcct ttttgaaatg gcttttggag ttgccctctt ggcaggagca acatcttctt
180

acctttggat taaaattgtc acgggtctgg gacatgctat tcttgcttca attctcttgt
240

accaagccaa atctatatac ttgagcaaca aagtt
275 

<210> 22 
<211> 299 
<212> DNA 
<213> Glycine sp 

<220> 

<221> misc_feature
<222> (1) . . . (299)
<223> n = A,T,C or G

<400> 22 

ccanaatang tncatcttng aaagacaatt ggcctcttca acacacaagt ctgcatgtga
60

agaagaggcc aattgtcttt ccaagatcac ttatngtggc tattgtaatc atgaacttct
120

tctttgtggg tatggcattg gcaaaggata tacctanctg ttgaaggaga taaaatatat
180

ggcattgata cttttgcaat acgtataggt caaaaacaag tattttggat ttgtattttc
240

ctttttgaaa ggctttcgga gtttccctag tggcaggagc aacatcttct agccttggt
299 

<210> 23 
<211> 767 
<212> DNA 
<213> Glycine sp 

<400> 23 

gtggaggctg tggttgctgc cctgtttatg aatatttata ttgttggttt gaatcaattg
60

tctgatgttg aaatagacaa gataaacaag ccgtatcttc cattagcatc tggggaatat
120

tcctttgaaa ctggtgtcac tattgttgca tctttttcaa ttctgagttt ttggcttggc
180

tgggttgtag gttcatggcc attattttgg gccctttttg taagctttgt gctaggaact
240

gcttattcaa tcaatgtgcc tctgttgaga tggaagaggt ttgcagtgct tgcagcgatg
300

tgcattctag ctgttcgggc agtaatagtt caacttgcat ttttccttca catgcagact
360

catgtgtaca agaggccacc tgtcttttca agaccattga tttttgctac tgcattcatg
420

agcttcttct ctgtagttat agcactgttt aaggatatac ctgacattga aggagataaa
480

gtatttggca tccaatcttt ttcagtgtgt ttaggtcaga agccggtgtt ctggacttgt






540 

gttacccttc ttgaaatagc ttatggagtc gccctcctgg tgggagctgc atctccttgt
600

ctttggagca aaattttcac gggtctggga cacgctgtgc tggcttcaat tctctggttt
660

catgccaaat ctgtagattt gaaaagcaaa gcttcgataa catccttcta tatgtttatt
720

tggaagctat tttatgcaga atacttactc attccttttg ttagatg
767

<210> 24 
<211> 255
<212> PRT 
<213> Glycine sp 

<400> 24 

Val Glu Ala Val Val Ala Ala Leu Phe Met Asn Ile Tyr Ile Val Gly

1 5 10 15

Leu Asn Gln Leu Ser Asp Val Glu Ile Asp Lys Ile Asn Lys Pro Tyr

20 25 30

Leu Pro Leu Ala Ser Gly Glu Tyr Ser Phe Glu Thr Gly Val Thr Ile

35 40 45

Val Ala Ser Phe Ser Ile Leu Ser Phe Trp Leu Gly Trp Val Val Gly

50 55 60

Ser Trp Pro Leu Phe Trp Ala Leu Phe Val Ser Phe Val Leu Gly Thr
65 70 75 80

Ala Tyr Ser Ile Asn Val Pro Leu Leu Arg Trp Lys Arg Phe Ala Val

85 90 95

Leu Ala Ala Met Cys Ile Leu Ala Val Arg Ala Val Ile Val Gln Leu

100 105 110

Ala Phe Phe Leu His Met Gln Thr His Val Tyr Lys Arg Pro Pro Val

115 120 125

Phe Ser Arg Pro Leu Ile Phe Ala Thr Ala Phe Met Ser Phe Phe Ser

130 135 140

Val Val Ile Ala Leu Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys
145 150 155 160

Val Phe Gly Ile Gln Ser Phe Ser Val Cys Leu Gly Gln Lys Pro Val

165 170 175

Phe Trp Thr Cys Val Thr Leu Leu Glu Ile Ala Tyr Gly Val Ala Leu

180 185 190

Leu Val Gly Ala Ala Ser Pro Cys Leu Trp Ser Lys Ile Phe Thr Gly

195 200 205

Leu Gly His Ala Val Leu Ala Ser Ile Leu Trp Phe His Ala Lys Ser

210 215 220

Val Asp Leu Lys Ser Lys Ala Ser Ile Thr Ser Phe Tyr Met Phe Ile
225 230 235 240

Trp Lys Leu Phe Tyr Ala Glu Tyr Leu Leu Ile Pro Phe Val Arg

245 250 255

<210> 25 
<211> 360 
<212> DNA 
<213> Zea sp 

<220> 

<221> misc_feature
<222> (1) . . . (360)
<223> n = A,T,C or G

<400> 25 

ggcgtcttca cttgttctgg tcttctcgta tcccctgatg aagaggttca cattttggcc
60

tcaggcttat cttggcctga cattcaactg gggagcttta ctagggtggg ctgctattaa
120

ggaaagcata gaccctgcaa atcatccttc cattgtatac agctggtatt tgttggacgc
180

tggtgtatga tactatatat gcgcatcagg tgtttcgcta tccctacttt catattaatc
240

cttgatgaag tggccatttc atgttgtcgc ggtggtctta tacttgcata tctccatgca
300

tctcaggaca aagangatga cctgaaagta ggagtccaag tccacagctt aagatttggg
360 

<210> 26 
<211> 299 
<212> DNA 
<213> Zea sp 

<220> 

<221> misc_feature
<222> (1) . . . (299)
<223> n = A,T,C or G 

<400> 26 

gatggttgca gcatctgcaa ataccctcaa ccaggtgttt gngataaaaa atgatgctaa
60

aatgaaaagg acaatgcgtg ccccctgcca tctggtcgca ttagtcctgc acatgctgcg
120

atgtgggcta caagtgttgg agttgcagga acagctttgt tggcctggaa ggctaatggc
180

ttggcagctg ggcttgcagc ttctaatctt gttctgtatg catttgtgta tacgccgttg
240 

aagcaaatac accctgttaa tacatgggtt ggggcagtcg ttggtgccat cccaccact 
299 

<210> 27 
<211> 255 
<212> DNA 
<213> Zea sp 

<220> 

<221> misc_feature 

<222> (1) . . . (255) 

<223> n = A,T,C or G

<400> 27 

anacttgcat atctccatgc ntctcaggac aaagangatg acctgaaagt aggtgtcaag 
60 

tccacagcat taagatttgg agatttgacc nnatactgna tcagtggctt tggcgcggca
120

tgcttcggca gcttagcact cagtggttac aatgctgacc ttggttggtg tttagtgtga
180

tgcttgagcg aagaatggta tngtttttac ttgatattga ctccagacct gaaatcatgt
240

tggacagggt ggccc
255 

<210> 28 
<211> 257 
<212> DNA
<213> Zea sp

<400> 28

attgaagggg ataggactct ggggcttcag tcacttcctg ttgcttttgg gatggaaact
60

gcaaaatgga tttgtgttgg agcaattgat atcactcaat tatctgttgc aggttaccta
120

ttgagcaccg gtaagctgta ttatgccctg gtgttgcttg ggctaacaat tcctcaggtg
180

ttctttcagt tccagtactt cctgaaggac cctgtgaagt atgatgtcaa atatcaggca
240

agcgcacaac cattctt
257

<210> 29 
<211> 368 
<212> DNA 

<213> Zea sp

<400> 29 

atccagttgc aaataataat ggcgttcttc tctgttgtaa tagcactatt caaggatata
60

cctgacatcg aaggggaccg catattcggg atccgatcct tcagcgtccg gttagggcaa
120

aagaaggtct tttggatctg cgttggcttg cttgagatgg cctacagcgt tgcgatactg
180

atgggagcta cctcttcctg tttgtggagc aaaacagcaa ccatcgctgg ccattccata
240

cttgccgcga tcctatggag ctgcgcgcga tcggtggact tgacgagcaa agccgcaata
300

acgtccttct acatgttcat ctggaagctg ttctacgcgg agtacctgct catccctctg
360 

gtgcggtg 
368 

<210> 30 
<211> 122 
<212> PRT 
<213> Zea sp 

<400> 30 



Ile Gln Leu Gln Ile Ile Met Ala Phe Phe Ser Val Val Ile Ala Leu

1 5 10 15

Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Arg Ile Phe Gly Ile Arg

20 25 30

Ser Phe Ser Val Arg Leu Gly Gln Lys Lys Val Phe Trp Ile Cys Val

35 40 45

Gly Leu Leu Glu Met Ala Tyr Ser Val Ala Ile Leu Met Gly Ala Thr

50 55 60

Ser Ser Cys Leu Trp Ser Lys Thr Ala Thr Ile Ala Gly His Ser Ile
65 70 75 80

Leu Ala Ala Ile Leu Trp Ser Cys Ala Arg Ser Val Asp Leu Thr Ser

85 90 95

Lys Ala Ala Ile Thr Ser Phe Tyr Met Phe Ile Trp Lys Leu Phe Tyr

100 105 110

Ala Glu Tyr Leu Leu Ile Pro Leu Val Arg

115 120



<210> 31 
<211> 278 
<212> DNA
<213> Zea sp

<400> 31

tattcagcac cacctctcaa gctcaagcag aatggatgga ttgggaactt cgctctgggt
60

gcgagttaca tcagcttgcc ctggtgggct ggccaggcgt tatttggaac tcttacacca
120

gatatcattg tcttgactac tttgtacagc atagctgggc tagggattgc tattgtaaat
180

gatttcaaga gtattgaagg ggataggact ctggggcttc agtcacttcc tgttgctttt
240 

gggatggaaa ctgcaaaatg gatttgtgtt ggagcaat 
278 

<210> 32 
<211> 292 
<212> PRT 

<213> Synechocystis sp 
<400> 32 

Met Val Ala Gln Thr Pro Ser Ser Pro Pro Leu Trp Leu Thr Ile Ile

1 5 10 15

Tyr Leu Leu Arg Trp His Lys Pro Ala Gly Arg Leu Ile Leu Met Ile

20 25 30

Pro Ala Leu Trp Ala Val Cys Leu Ala Ala Gln Gly Leu Pro Pro Leu

35 40 45

Pro Leu Leu Gly Thr Ile Ala Leu Gly Thr Leu Ala Thr Ser Gly Leu

50 55 60

Gly Cys Val Val Asn Asp Leu Trp Asp Arg Asp Ile Asp Pro Gln Val
65 70 75 80

Glu Arg Thr Lys Gln Arg Pro Leu Ala Ala Arg Ala Leu Ser Val Gln

85 90 95

Val Gly Ile Gly Val Ala Leu Val Ala Leu Leu Cys Ala Ala Gly Leu

100 105 110

Ala Phe Tyr Leu Thr Pro Leu Ser Phe Trp Leu Cys Val Ala Ala Val

115 120 125

Pro Val Ile Val Ala Tyr Pro Gly Ala Lys Arg Val Phe Pro Val Pro

130 135 140

Gln Leu Val Leu Ser Ile Ala Trp Gly Phe Ala Val Leu Ile Ser Trp
145 150 155 160

Ser Ala Val Thr Gly Asp Leu Thr Asp Ala Thr Trp Val Leu Trp Gly

165 170 175

Ala Thr Val Phe Trp Thr Leu Gly Phe Asp Thr Val Tyr Ala Met Ala

180 185 190

Asp Arg Glu Asp Asp Arg Arg Ile Gly Val Asn Ser Ser Ala Leu Phe

195 200 205

Phe Gly Gln Tyr Val Gly Glu Ala Val Gly Ile Phe Phe Ala Leu Thr

210 215 220

Ile Gly Cys Leu Phe Tyr Leu Gly Met Ile Leu Met Leu Asn Pro Leu
225 230 235 240

Tyr Trp Leu Ser Leu Ala Ile Ala Ile Val Gly Trp Val Ile Gln Tyr

245 250 255

Ile Gln Leu Ser Ala Pro Thr Pro Glu Pro Lys Leu Tyr Gly Gln Ile

260 265 270

Phe Gly Gln Asn Val Ile Ile Gly Phe Val Leu Leu Ala Gly Met Leu

275 280 285

Leu Gly Trp Leu
290



<210> 33 
<211> 316 
<212> PRT 

<213> Synechocystis sp 






<400> 33 



Met Val Thr Ser Thr Lys Ile His Arg Gln His Asp Ser Met Gly Ala

1 5 10 15

Val Cys Lys Ser Tyr Tyr Gln Leu Thr Lys Pro Arg Ile Ile Pro Leu

20 25 30

Leu Leu Ile Thr Thr Ala Ala Ser Met Trp Ile Ala Ser Glu Gly Arg

35 40 45

Val Asp Leu Pro Lys Leu Leu Ile Thr Leu Leu Gly Gly Thr Leu Ala

50 55 60

Ala Ala Ser Ala Gln Thr Leu Asn Cys Ile Tyr Asp Gln Asp Ile Asp
65 70 75 80

Tyr Glu Met Leu Arg Thr Arg Ala Arg Pro Ile Pro Ala Gly Lys Val

85 90 95

Gln Pro Arg His Ala Leu Ile Phe Ala Leu Ala Leu Gly Val Leu Ser

100 105 110

Phe Ala Leu Leu Ala Thr Phe Val Asn Val Leu Ser Gly Cys Leu Ala

115 120 125

Leu Ser Gly Ile Val Phe Tyr Met Leu Val Tyr Thr His Trp Leu Lys

130 135 140

Arg His Thr Ala Gln Asn Ile Val Ile Gly Gly Ala Ala Gly Ser Ile
145 150 155 160

Pro Pro Leu Val Gly Trp Ala Ala Val Thr Gly Asp Leu Ser Trp Thr

165 170 175

Pro Trp Val Leu Phe Ala Leu Ile Phe Leu Trp Thr Pro Pro His Phe

180 185 190

Trp Ala Leu Ala Leu Met Ile Lys Asp Asp Tyr Ala Gln Val Asn Val

195 200 205

Pro Met Leu Pro Val Ile Ala Gly Glu Glu Lys Thr Val Ser Gln Ile

210 215 220

Trp Tyr Tyr Ser Leu Leu Val Val Pro Phe Ser Leu Leu Leu Val Tyr
225 230 235 240

Pro Leu His Gln Leu Gly Ile Leu Tyr Leu Ala Ile Ala Ile Ile Leu

245 250 255

Gly Gly Gln Phe Leu Val Lys Ala Trp Gln Leu Lys Gln Ala Pro Gly

260 265 270

Asp Arg Asp Leu Ala Arg Gly Leu Phe Lys Phe Ser Ile Phe Tyr Leu

275 280 285

Met Leu Leu Cys Leu Ala Met Val Ile Asp Ser Leu Pro Val Thr His

290 295 300

Gln Leu Val Ala Gln Met Gly Thr Leu Leu Leu Gly
305 310 315



<210> 34 
<211> 324 
<212> PRT 

<213> Synechocystis sp 



<400> 34 



Met Ser Asp Thr Gln Asn Thr Gly Gln Asn Gln Ala Lys Ala Arg Gln

1 5 10 15

Leu Leu Gly Met Lys Gly Ala Ala Pro Gly Glu Ser Ser Ile Trp Lys

20 25 30

Ile Arg Leu Gln Leu Met Lys Pro Ile Thr Trp Ile Pro Leu Ile Trp

35 40 45

Gly Val Val Cys Gly Ala Ala Ser Ser Gly Gly Tyr Ile Trp Ser Val

50 55 60

Glu Asp Phe Leu Lys Ala Leu Thr Cys Met Leu Leu Ser Gly Pro Leu
65 70 75 80

Met Thr Gly Tyr Thr Gln Thr Leu Asn Asp Phe Tyr Asp Arg Asp Ile

85 90 95
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7\ _ ^ ft 1 

Asp Ala 


lie 


Asn 


Glu 


Pro 


Tyr 


Arg 






100 










Val Pro 


Gin 


Val 


Val 


Thr 


Gin 


lie 




115 










120 


Gly Val 


Ala 


Tyr 


Gly 


Leu 


Asp 


Val 


130 










135 




Met Met 


Val 


Leu 


Thr 


Leu 


Gly Gly 


145 








150 






Ala Pro 


Pro 


Leu 


Lys 


Leu 


Lys 


Gin 








165 








Leu Gly 


Ala 


Ser 


Tyr 


lie 


Ala 


Leu 




180 










Phe Gly 


Thr 


Leu 


Asn 


Pro 


Thr 


He 


195 










200 


Leu Ala 


Gly 


Leu 


Gly 


He 


Ala 


Val 


210 










215 




Gly Asp 


Arg 


Gin 


Leu 


Gly 


Leu 


Lys 


225 








230 






Gly Thr 


Ala 


Ala 


Trp 


He 


Cys 


Val 








245 








Gly lie 


Ala 


Gly 


Tyr Leu 


He 


Tyr 






s\ ?~ r\ 
260 










lie Val 


Leu 


Leu 


Leu 


Leu 


He 


Pro 




275 










280 


Phe Leu 


Arg 


Asn 


Pro 


Leu 


Glu 


Asn 


290 








295 




Gin Pro 


Phe 


Leu 


Val 


Phe 


Gly 


Met 


305 








310 






His Ala 


Gly 


He 











<210> 35 
<211> 307 
<212> PRT 

<213> Synechocystis sp 
<400> 35 



Met Thr 


Glu 


Ser 


Ser 


Pro 


Leu 


Ala 


1 






5 








Lys Leu 


Trp 


Leu 
20 


Ala 


Ala 


He 


Lys 


Val Pro 


He 
35 


Thr 


Val 


Gly 


Ser 


Ala 
40 


Trp His 


Gly 


Asp 


Val 


Phe 


Thr 


He 


50 










55 




He Ala 


Trp 


He 


Asn 


Leu 


Ser 


Asn 


65 






70 






He Asp 


Val 


Arg 


Lys 
85 


Ala 


His 


Ser 


Asn Leu 


Val 


Phe 
100 


Leu 


He 


Ser 


Asn 


Gly Leu 


Met 


Ser 


Met 


Ser 


Trp Arg 


115 










120 


Leu He 


Gly 


Val 


Ala 


He 


Phe 


Leu 


130 










135 




Phe Arg 


Leu 


Gly 


Tyr 


Leu 


Gly 


Leu 


145 








150 






Phe Gly 


Pro 


Leu 


Ala 


He 


Ala 


Ala 






165 








Phe Ser 


Trp 


Asn 
130 


Leu 


Leu 


Thr 


Pro 



Pro 


He 


Pro 




Gly Ala 


lie 


oer 












110 






T A • t 

Leu 


T T a 

He 


T A.I 

Leu 


Leu 


Val 


Ala 


biy 


lie 










125 








Trp 


Ala 


Gin 


His 


Asp 


Phe 


D v *~i 


lie 








140 










Ala 


Phe 


Val 


Ala 


Tyr 


He 


Tyr 


Ser 






155 










1 bU 


Asn 


Gly 


Trp 


Leu 


Gly Asn 


Tyr 


Ala 




170 










1 / 0 




Pro 


Trp 


Trp Ala Gly 


HlS 


Ala 


Leu 


18 5 










190 






Met 


if. T 

Val 


Leu 


Thr 


Leu 


He 


iyr 


oer 










205 








Val 


?\ A A 

Asn 


Asp 


Phe 


Lys 


Ser 


vai 


blU 








220 










Ser 


Leu 


Pro 


Val 


Met 


Phe 


(jly 


T T a 

lie 






235 












He 


Met 


He 


Asp 


val 


Phe 


Gin 


7\ "1 A 

Ala 




OCA 

250 














Val 


His 


Gin 


Gin 


Leu 


Tyr 


mm* +\ 

Ala 


Thr 


265 










270 






Gin 


He 


Thr 


Phe 


Gin 


Asp 


Met 


Tyr 










285 








Asp 


Val 


Lys 


Tyr 


Gin 


Ala 


Ser 


Ala 








300 










Leu 


Ala 


Thr 


Gly 


Leu 


Ala 


Leu 


Gly 






315 










320 



Pro 


Ser 


Thr 


Ala 


Pro 


Ala 


Thr Arg 




10 










15 




Pro 


Pro 


Met 


Tyr 


Thr 


Val 


Ala 


Val 


25 










30 






Val 


Ala 


Tyr Gly Leu 


Thr 


Gly 


Gin 










45 








Phe 


Leu 


Leu 


Ser 
60 


Ala 


He 


Ala 


He 


Asp 


Val 


Phe 


Asp 


Ser Asp 


Thr 


Gly 






75 










80 


Val 


Val 


Asn 


Leu 


Thr 


Gly 


Asn Arg 




90 










95 




Phe 


Phe 


Leu 


Leu 


Ala 


Gly 


Val 


Leu 


105 










110 






Ala 


Gin 


Asp 


Trp 


Thr 
125 


Val 


Leu 


Glu 


Gly Tyr 


Thr 


Tyr 


Gin 


Gly 


Pro 


Pro 








140 










Gly Glu 


Leu 


He 


Cys 


Leu 


He 


Thr 






155 










160 


Ala 


Tyr 


Tyr 


Ser 


Gin 


Ser 


Gin 


Ser 




170 








175 




Ser 


Val 


Phe 


Val 


Gly 


He 


Ser 


Thr 


185 










190 
Ala Ile Ile Leu Phe Cys Ser His Phe His Gln Val Glu Asp Asp Leu

195 200 205

Ala Ala Gly Lys Lys Ser Pro Ile Val Arg Leu Gly Thr Lys Leu Gly

210 215 220

Ser Gln Val Leu Thr Leu Ser Val Val Ser Leu Tyr Leu Ile Thr Ala
225 230 235 240

Ile Gly Val Leu Cys His Gln Ala Pro Trp Gln Thr Leu Leu Ile Ile

245 250 255

Ala Ser Leu Pro Trp Ala Val Gln Leu Ile Arg His Val Gly Gln Tyr

260 265 270

His Asp Gln Pro Glu Gln Val Ser Asn Cys Lys Phe Ile Ala Val Asn

275 280 285

Leu His Phe Phe Ser Gly Met Leu Met Ala Ala Gly Tyr Gly Trp Ala

290 295 300

Gly Leu Gly
305

<210> 36 
<211> 927 
<212> DNA 

<213> Synechocystis sp 
<400> 36 

atggcaacta tccaagcttt ttggcgcttc tcccgccccc ataccatcat tggtacaact 
60 

ctgagcgtct gggctgtgta tctgttaact attctcgggg atggaaactc agttaactcc 
120 

cctgcttccc tggatttagt gttcggcgct tggctggcct gcctgttggg taatgtgtac 
180

attgtcggcc tcaaccaatt gtgggatgtg gacattgacc gcatcaataa gccgaatttg
240 

cccctagcta acggagattt ttctatcgcc cagggccgtt ggattgtggg actttgtggc 
300 

gttgcttcct tggcgatcgc ctggggatta gggctatggc tggggctaac ggtgggcatt
360 

agtttgatta ttggcacggc ctattcggtg ccgccagtga ggttaaagcg cttttccctg 
420 

ctggcggccc tgtgtattct gacggtgcgg ggaattgtgg ttaacttggg cttattttta
480 

ttttttagaa ttggtttagg ttatcccccc actttaataa cccccatctg ggttttgact 
540 

ttatttatct tagttttcac cgtggcgatc gccattttta aagatgtgcc agatatggaa
600 

ggcgatcggc aatttaagat tcaaacttta actttgcaaa tcggcaaaca aaacgttttt 
660 

cggggaacct taattttact cactggttgt tatttagcca tggcaatctg gggcttatgg 
720 

gcggctatgc ctttaaatac tgctttcttg attgtttccc atttgtgctt attagcctta
780

ctctggtggc ggagtcgaga tgtacactta gaaagcaaaa ccgaaattgc tagtttttat
840

cagtttattt ggaagctatt tttcttagag tacttgctgt atcccttggc tctgtggtta
900

cctaattttt ctaatactat tttttag
927 

<210> 37 
<211> 308 
<212> PRT 

<213> Synechocystis sp 
<400> 37

Met Ala Thr Ile Gln Ala Phe Trp Arg Phe Ser Arg Pro His Thr Ile

1 5 10 15

Ile Gly Thr Thr Leu Ser Val Trp Ala Val Tyr Leu Leu Thr Ile Leu

20 25 30

Gly Asp Gly Asn Ser Val Asn Ser Pro Ala Ser Leu Asp Leu Val Phe

35 40 45

Gly Ala Trp Leu Ala Cys Leu Leu Gly Asn Val Tyr Ile Val Gly Leu

50 55 60

Asn Gln Leu Trp Asp Val Asp Ile Asp Arg Ile Asn Lys Pro Asn Leu
65 70 75 80

Pro Leu Ala Asn Gly Asp Phe Ser Ile Ala Gln Gly Arg Trp Ile Val

85 90 95

Gly Leu Cys Gly Val Ala Ser Leu Ala Ile Ala Trp Gly Leu Gly Leu

100 105 110

Trp Leu Gly Leu Thr Val Gly Ile Ser Leu Ile Ile Gly Thr Ala Tyr

115 120 125

Ser Val Pro Pro Val Arg Leu Lys Arg Phe Ser Leu Leu Ala Ala Leu

130 135 140

Cys Ile Leu Thr Val Arg Gly Ile Val Val Asn Leu Gly Leu Phe Leu
145 150 155 160

Phe Phe Arg Ile Gly Leu Gly Tyr Pro Pro Thr Leu Ile Thr Pro Ile

165 170 175

Trp Val Leu Thr Leu Phe Ile Leu Val Phe Thr Val Ala Ile Ala Ile

180 185 190

Phe Lys Asp Val Pro Asp Met Glu Gly Asp Arg Gln Phe Lys Ile Gln

195 200 205

Thr Leu Thr Leu Gln Ile Gly Lys Gln Asn Val Phe Arg Gly Thr Leu

210 215 220

Ile Leu Leu Thr Gly Cys Tyr Leu Ala Met Ala Ile Trp Gly Leu Trp
225 230 235 240

Ala Ala Met Pro Leu Asn Thr Ala Phe Leu Ile Val Ser His Leu Cys

245 250 255

Leu Leu Ala Leu Leu Trp Trp Arg Ser Arg Asp Val His Leu Glu Ser

260 265 270

Lys Thr Glu Ile Ala Ser Phe Tyr Gln Phe Ile Trp Lys Leu Phe Phe

275 280 285

Leu Glu Tyr Leu Leu Tyr Pro Leu Ala Leu Trp Leu Pro Asn Phe Ser

290 295 300

Asn Thr Ile Phe

<210> 38 
<211> 1092 
<212> DNA 

<213> Synechocystis sp 
<400> 38 

atgaaatttc cgccccacag tggttaccat tggcaaggtc aatcaccttt ctttgaaggt 
60 

tggtacgtgc gcctgctttt gccccaatcc ggggaaagtt ttgcttttat gtactccatc 
120 

gaaaatcctg ctagcgatca tcattacggc ggcggtgctg tgcaaatttt agggccggct
180 

acgaaaaaac aagaaaatca ggaagaccaa cttgtttggc ggacatttcc ctcggtaaaa 
240 

aaattttggg ccagtcctcg ccagtttgcc ctagggcatt ggggaaaatg tagggataac 
300 

aggcaggcga aacccctact ctccgaagaa ttttttgcca cggtcaagga aggttatcaa 
360 

atccatcaaa atcagcacca aggacaaatc attcatggcg atcgccattg tcgttggcag 
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Saccgtag a.ccg.aagt aactt.ggg agtcctaacc 

- tggcfctt cctttttaCC Cttgtttg3t CC^e^ ^.C^ 3.CCC 

Uc- atggCagagg g-agtatg — cgccct 

t ccgaaa aaaa^.. t^ctccttt ccctcc.ct 

tStttcet, acc at cc agg ac tgagCgt c act.cc.ctg g.tfttc 

555 , r tttaattggc ttacatcacc aaggtaattt ttacgaattt 

gg tc g cccc g a a gagg ta g c tttaattggc 

•7an ». afreet oaggccgttg g c a attaaaa 

7 =*-nnracaat cactt gg caa gtagctccct gyyy y 

qqcccgggcc atggcacagu i-av. 

5^..* «c gg » a .a W u«. 

ISuttt* »« 959 »« «""- ta9 t9Ca • 9995, "°" 

••*""« t5 »" 9999t tl " e,9a " aa,attt9a9 



1080 
gtgccattct ga 

1092 

<210> 39 
<211> 363 
<212> PRT 

<213> Synechocystis sp 



<400> 39 G1 xyr His Trp Gin Gly Gin Ser Pro 

Met Lys Phe Pro Pro His Ser Gly y 15 

1 5 T^n Lpii Pro Gin Ser Gly Giu 

pL Phe Glu Gly Trp Tyr v.l Arg Leu Leu Leu ^ 

S Me t Tvr Ser lie Glu Asn Pro Ala Ser Asp Hxs His 
Ser Phe Ala Phe Met Tyr ser ix 45 

35 , „ i rin lie Leu Gly Pro Ala Thr Lys Lys Gin 

Tyr Gly Gly Gly Ala val g n XI L y ^ 

Glu Asn Gin Glu Asp Gin Leu v a l Trp Ar g Thr Ph 

65 70 oha Ala Leu Gly His Trp Gly Lys 

lis Phe Trp Ala Ser Pro Ar g Gin Phe Ala Leu y ^ 

Lys Phe i.p 90 phe phe 

Cys Arg Asp Asn Arg Gin Ala Lys Pro Leu Leu 

10° ^ tio His Gin Asn Gin His Gin Gly 

Al. Thr Val Lys Glu Gly Tyr Gin He Hx. 125 

-\*\ c > 1 * rin Phe Thr Val Glu 

Gin lie I^e His Gly Asp Arg His Cys Arg Trp Gin 

130 135 „ » a™ Phe Pro Arg Ala Thr Ala 

a r ;:: m i: h :: r: :. ; s « ^ - - s 
: - :;: i s ». - - 3 s - - - a - - 

180 i" Glu Lys Asn Trp Gly His 

Tyr Glu Phe Asp His Ala Leu Val Tyr Ala ^ 
Tyr <»xu 200 phe Pr0 Asp 

Ser Phe Pro Ser Ar 9 Trp Phe Trp Leu Gin Ala AS Q 

210 rw Leu Ser Val Thr Ala Ala Gly Gly Glu Arg He Val Leu 

His Pro Gly Leu ber 
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225 230 235 240 



Glv 


ATfT 


Pro 


Glu 

V3 J* U 


Glu 


Val 


nj.a 


T iP 1 1 


X X c 


Gl v 


T.PH 1 


Hi <s 
it j. ^ 


Hi q 

nig 


m n 

VJ JL 1 1 


Glv 


L \ Oil 










24 5 










J V 










25 5 




Ph <=» 


T vr* 


Glu 


Phe 


Glv 


Pro 


Glv 


His 


Glv 


Thr 


Val 


Thr 




Gin 


Val 

' W JL 


Ala 








260 










* U J 










270 






Pro 


* — t* 


Glv 


Ara 


TrD 


Gin 


Leu 


Lvs 


Ala 


Ser 


Asn 


Asd 


Ara 


Tvr 


TrD 


Val 

v W JL 






275 










280 










285 

tm \J 








Lys Leu Ser Gly Lys Thr Asp Lys Lys Gly Ser Leu Val His Thr Pro
    290                 295                 300

Thr Ala Gln Gly Leu Gln Leu Asn Cys Arg Asp Thr Thr Arg Gly Tyr
305                 310                 315                 320

Leu Tyr Leu Gln Leu Gly Ser Val Gly His Gly Leu Ile Val Gln Gly
                325                 330                 335

Glu Thr Asp Thr Ala Gly Leu Glu Val Gly Gly Asp Trp Gly Leu Thr
            340                 345                 350

Glu Glu Asn Leu Ser Lys Lys Thr Val Pro Phe
        355                 360



<210> 40 
<211> 56 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 40 

cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgcag ggcgcgccat ttaaat 
56 

<210> 41 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 41 

tcgaggatcc gcggccgcaa gcttcctgca gg 
32 

<210> 42 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 42 

tcgacctgca ggaagcttgc ggccgcggat cc 
32 

<210> 43 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 43 

tcgacctgca ggaagcttgc ggccgcggat cc 
32 

<210> 44 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 44 

tcgaggatcc gcggccgcaa gcttcctgca gg 
32 

<210> 45 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 45 

tcgaggatcc gcggccgcaa gcttcctgca ggagct 
36 

<210> 46 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 46 

cctgcaggaa gcttgcggcc gcggatcc 
28 

<210> 47 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 47 

tcgacctgca ggaagcttgc ggccgcggat ccagct 
36 

<210> 48 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 48 

ggatccgcgg ccgcaagctt cctgcagg 
28 

<210> 49 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 49 

gatcacctgc aggaagcttg cggccgcgga tccaatgca 
39 

<210> 50 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 50 

ttggatccgc ggccgcaagc ttcctgcagg t 
31 

<210> 51 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 51 

ggatccgcgg ccgcacaatg gagtctctgc tctctagttc t 
41 

<210> 52 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 52 

ggatcctgca ggtcacttca aaaaaggtaa cagcaagt 
38 

<210> 53 
<211> 45 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 53 
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ggatccgcgg ccgcacaatg gcgbttthtg ggctctcccg tgbtt 
45 

<210> 54 
<211> 40 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 54 

ggatcctgca ggttattgaa aacttcttcc aagtacaact 
40 

<210> 55 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 55 

ggatccgcgg ccgcacaatg tggcgaagat ctgttgtt 
38 

<210> 56 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 56 

ggatcctgca ggtcatggag agtagaagga aggagct 
37 

<210> 57 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 57 

ggatccgcgg ccgcacaatg gtacttgccg aggttccaaa gcttgcctct 
50 

<210> 58 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 58 

ggatcctgca ggtcacttgt ttctggtgat gactctat 
38 

<210> 59 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 59 

ggatccgcgg ccgcacaatg acttcgattc tcaacact 
38 

<210> 60 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 60 

ggatcctgca ggtcagtgtt gcgatgctaa tgccgt 
36 

<210> 61 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 61 

taatgtgtac attgtcggcc tc 
22 

<210> 62 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 62 

gcaatgtaac atcagagatt ttgagacaca acgtggcttt ccacaattcc ccgcaccgtc 
60 

<210> 63 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 63 

aggctaataa gcacaaatgg ga 
22 

<210> 64 
<211> 63 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 64 

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggaattgg tttaggttat 
60 

ccc 

63 

<210> 65 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 65 

ggatccatgg ttgcccaaac cccatc 
26 

<210> 66 
<211> 61 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 66 

gcaatgtaac atcagagatt ttgagacaca acgtggcttt gggtaagcaa caatgaccgg 

60 

c 

61 

<210> 67 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 67 

gaattctcaa agccagccca gtaac 
25 

<210> 68 
<211> 63 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 68 

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgggtgcga aaagggtttt 

60 

ccc 

63 

<210> 69 
<211> 23 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 69 

ccagtggttt aggctgtgtg gtc 
23 

<210> 70 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 70 

ctgagttgga tgtattggat c 
21 

<210> 71 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 71 

ggatccatgg ttacttcgac aaaaatcc 
28 

<210> 72 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 72 

gcaatgtaac atcagagatt ttgagacaca acgtggcttt gctaggcaac cgcttagtac 
60 

<210> 73 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 73 

gaattcttaa cccaacagta aagttccc 
28 

<210> 74 
<211> 63 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 74 

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgccggcat tgtcttttac 
60 

atg 
63 

<210> 75 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 75 

ggaacccttg cagccgcttc 
20 

<210> 76 
<211> 22 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 76 

gtatgcccaa ctggtgcaga gg 
22 

<210> 77 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 77 

ggatccatgt ctgacacaca aaataccg 
28 

<210> 78 
<211> 62 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 

<400> 78 

gcaatgtaac atcagagatt ttgagacaca acgtggcttt cgccaatacc agccaccaac 

60 

ag 

62 

<210> 79 

<211> 27 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 

<400> 79 

gaattctcaa atccccgcat ggcctag 
27 

<210> 80 

<211> 65 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 

<400> 80 

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggcctacg gcttggacgt 
60 

gtggg 
65 

<210> 81 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 

<400> 81 

cacttggatt cccctgatct g 
21 

<210> 82 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 

<400> 82 

gcaatacccg cttggaaaac g 
21 

<210> 83 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 83 

ggatccatga ccgaatcttc gcccctagc 
29 

<210> 84 
<211> 61 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 84 

gcaatgtaac atcagagatt ttgagacaca acgtggcttt caatcctagg tagccgaggc 
60 



<210> 85 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 85 

gaattcttag cccaggccag cccagcc 
27 

<210> 86 
<211> 66 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 86 

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggggaatt gatttgttta 
60 

attacc 
66 

<210> 87 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 87 

gcgatcgcca ttatcgcttg g 
21 

<210> 88 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 88 

gcagactggc aattatcagt aacg 
24 

<210> 89 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 89 

ccatggattc gagtaaagtt gtcgc 
25 

<210> 90 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Oligonucleotide 
<400> 90 

gaattcactt caaaaaaggt aacag 
25 

<210> 91 
<211> 4550 
<212> DNA 

<213> Arabidopsis sp 
<400> 91 

attttacacc aatttgatca cttaactaaa ttaattaaat tagatgatta tcccaccata 
60 

tttttgagca ttaaaccata aaaccatagt tataagtaac tgttttaatc gaatatgact 
120 

cgattaagat taggaaaaat ttataaccgg taattaagaa aacattaacc gtagtaaccg 
180 

taaatgccga ttcctccctt gtctaaaaga cagaaaacat atattttatt ttgccccata 
240 

tgtttcactc tatttaattt caggcacaat acttttggtt ggtaacaaaa ctaaaaagga 
300 

caacacgtga tacttttcct cgtccgtcag tcagattttt tttaaactag aaacaagtgg 
360 

caaatctaca ccacattttt tgcttaatct attaacttgt aagttttaaa ttcctaaaaa 
420 

agtctaacta attcttctaa tataagtaca ttccctaaat ttcccaaaaa gtcaaattaa 
480 

taattttcaa aatctaatct aaatatctaa taattcaaaa tcattaaaaa gacacgcaac 
540 

aatgacacca attaatcatc ctcgacccac acaattctac agttctcatg ctaaaccata 
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600 

bbbbbbgcbc 
660 

bbcfcbbgbcb 
720 

abggagfcctc 
780 

gtbtcaggb t 
840 

bbgaacbbbb 
900 

fcagabcgaag 
960 

bgaabtbbgb 
1020 

cagaatctaa 
1080 

aabacbcaafc 
1140 

bbbabgagac 
1200 

gbtactgatg 
1260 

gtbctgcgtb 
1320 

aggcctgabg 
1380 

gbtaabgcca 
1440 

agagacbcgb 
1500 

aagtbbcbct 
1560 

baacabfcagc 
1620 

cagbagagaa 
1680 

ababataaca 
1740 

abbgggbbtb 
1800 

ctaaabcagb 
1860 

bbcgagagac 
1920 

gcaggbbaac 
1980 

bgcaatagba 
2040 

tagaatbcba 
2100 

gtggattgbb 
2160 

bgcabacbct 
2220 

bgcagbbbcb 
2280 

ccacbbbbac 
2340 

gcbafcbabtg 
2400 



bcbgbbccbb 
bbgabbfcbbg 
bgcbcbctag 
bbabbbgbbg 
cbgaabataa 
baggbgacaa 
bbcbcabgca 
agcbccacbc 
cabctbagbc 
aabgbabgbb 
bbgbbbagcb 
gtgatbcgag 
gtcaaggabc 
cbgcgggbca 
tagabgcgtb 
tbaaaaabgt 
tcbgbgattg 
ggtbtctgat 
cataatgacc 
gtbbtcaggc 
tgbctgabgb 
bgatgagabb 
aagccctabc 
gcbbccbtcb 
taagbbacbg 
ggbtcatggc 
abcaabgbaa 
agtbbbaggb 
ggbggaaaag 
tbcaaabcgc 



caaaabcabb 
abbbbbbbbc 
bbcbbctcbb 
bbtaggbbbc 
aabaaggaaa 
aggbbatbgb 
bgcaactbab 
bbbabcaggb 
bcabbabbcb 
ggacbbagbb 
cbbbacacca 
baaagbbgbc 
bbcabbgbbg 
gccbgaggcb 
bbacaggbtt 
aacbcbtbta 
gabbbgcagg 
afcabcbccbfc 
gabgaagaag 
bgbbgfctgca 
bgaaabagab 
aabagcagcb 
bbccabbggc 
ccabcabggb 
aaabagtbbg 
cabbgbtcbg 
gbaagbbtcb 
baabgaggbb 
abbbgcabbg 
cbbbbabcba 



bcbt bcbctt 
bcbcbggcgb 
gbbbccgcbg 
gbbbbbgbga 
aagbbtcgab 
gbggagaagc 
caabcagcbg 
tcgbbagggb 
abbggbbgaa 
gaagbbctbc 
abababacac 
gcaaaaccga 
bbgbabccaa 
bbcgacbcga 
bcbaggccbc 
aaacgcaabc 
tgcbbagcab 
bacbbbtcac 
abacabbtbt 
gcbcbcabga 
aaggbaacab 
agbgccbaga 
abcaggagaa 
abggbgccab 
bbabaaabcg 
ggcbcbbbfcb 
caabacbaga 
bbaabaacbb 
gbbgcagcaa 
cababbcagg 



cbbbgabbcc 
gaaggaagaa 
gbaaabcbcg 
bbcagaacca 
bbbbabaabg 
abaabbbcbg 
gbgggttbbg 
bbbatgggbb 
bcacabtbbc 
bcbbbggbba 
ccaabbbbgc 
agbttaggaa 
aacabaagbc 
abagcaaaca 
abacagtbab 
tfcbcagggbb 
bbbabcbgba 
bggcabcbtg 
bbcgbctcbc 
bgaacattba 
gcaaatbbbc 
bcabctcbab 
babbcbgbta 
tbbcacaaaa 
bbabagagtb 
gbgagtbtca 
abtbggcbca 
acbbctacba 
bgbgbabccb 
bacbaaacca 



caaagatcac 
gcbtbabbbc 
bccbbbbcbg 
bacaaaaagb 
aatbgbbbac 
ggcbbgacbb 
bbggaagaag 
bbtgaaabba 
baabbbggaa 
bagbbgaagb 
agaaabccga 
caabcbbgbb 
gagabbbcgg 
gaagtcbtbb 
bggcacagtb 
bbcaaggaga 
bctbbcbbag 
gaggbaatga 
bgtbtaaaca 
catagbbggg 
bbcatatgag 
gbgggbbttb 
acaccggcab 
bbbcaactbb 
bctggcbbgg 
bgcbcggbac 
aatcaaaabc 
caaacagbbg 
cgcbgbccga 
bbbbccbbab 
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gbbfcbgfcagb tgttttcatc aaaabcacbt ttatattact aaagctgtga aacbtbgbbg 
2460 

cagacacatg tgtfcbggaag accaatcttg bbcacfcaggc ctcbtabtbb cgccactgcg 
2520 

bbtabgagct ttttcfcctgt cgttattgca ttgbtbaagg taaacaaaga bggaaaaaga 
2580 

fcbaaabcbab gfcabacbtaa agtaaagcab fcctactgbfca btgabgagaa gtbbtctbbb 
2640 

fcfcggfcbggat gcaggababa ccbgababcg aaggggataa gatabtcgga atccgabcab 
2700 

bcbctgbaac bctgggtcag aaacgggtac gababcbaaa cbaaagaaab bgtbbbgact 
2760 

caagbgbtgg afctaagabfca cagaagaaag aaaacbgbtt bbgbbtcbbg caaaabbcag 
2820 

gbgttbbgga catgbgbtac acbacbtcaa atggcfctacg cbgbtgcaab bcfcagbbgga 
2880 

gccacabcfcc catbcatatg gagcaaagbc atcbcggbaa caabctttct fcbacccatcg 
2940 

aaaacbcgct aatbcatcgt ttgagbggta cbggbttcab bttgbbccgt bctgttgabt 
3000 

bttbfcbcagg ttgbgggtca tgbbabacbc gcaacaactt bgtgggctcg agctaagtcc 
3060 

gtbgabctga gtagcaaaac cgaaabaact tcabgfcbata tgttcabatg gaaggfcbaga 
3120 

fcbcgbbbaba aatagagbcb ttacbgcctt bbtatgcgcb ccaafcbtgga abbaaaabag 
3180 

ccbbtcagbb bcabcgaabc accabbabac bgabaaabbc bcatbbcbgc atcagctctt 
3240 

bbabgcagag tacbtgctgb tacctttttt gaagbgactg acabtagaag agaagaagat 
3300 

ggagabaaaa gaataagtca bcacbatgcfc fccbgbbbtta ttacaagbbc atgaaabbag 
3360 

gbagbgaacb agtgaabbag agtbtbabtc tgaaacatgg cagacbgcaa aaabatgtca 
3420 

aagatatgaa tbbctgbbgg gbaaagaagb ctcbgcbtgg gcaaaabcbb aaggtbcggb 
3480 

gfcgttgabat aatgctaagc gaagaaabcg abbcfcatgba gaaatbtccg aaacbabgbg 
3540 

baaacabgbc agaacatctc catbctatafc cttcbbctgc aagaaagcbc bgtbbbtatc 
3600 

accbaaacbc tttatctctg tgfcagtbaag atatgbatat gbacgbgacb acattttttb 
3660 

gtbgatgtaa tbtgcagaac gtabggattt btgtbagaaa gcatgagbbc gaaagtatab 
3720 

gtbtababab afcggabaabt cagaccbaac gtcgaagctc acaagcataa atbcactacb 
3780 

abagbbbgcb cbgbaabaga bagbbccabt gatgbcbtga aactgtacgt aacfcgcctgg 
3840 

gcgbbbbgbg gttgabactg acbacfcgagt gbtcbfcbgtg agbgbtgtaa gbafcacaaga 
3900 

agaagaatat aggcbcacgg gaacgactgb ggbggaagat gaaabggaga bcabcacgba 
3960 

gcggctttgc caaagaccga gbcacgatcg agfcctatgaa gtcbbfcacag ctgctgatba 
4020 

fcgabtgacca btgcbtagag acgcatbgga atcfcbacfcag ggacbbgccb gggagbttcfc 
4080 

bcaagtacgt gbcagatcat acgabgfcagg agabfcbcacg gcfcbbgabgb gfcbtgtbbgg 
4140 

agbcacaatg cbbaabgggc tbabbggccc aataafcagcb agctcbbbbg ctfctagccgb 
4200 

ttcgtbbgtc ccctggbggt gagbafcbafct agggbabggb gbgaccaaag fccaccagacc 
4260 

bagagbgaab cbagbagagt ccbagaccab ggbccabggc bbttabbbgb aabbbgaaaa 
4320 

abgaacaabb ctbbbbgbaa ggaaaacbbb batabagbag acgtbbacba babagaaacb 
4380 

agfcbgaacba acbbcgfcgca abbgcabaab aabggfcgbga aabagagggt gcaaaacbca 
4440 

ataaacafcbt cgacgbacca agagfctcgaa acaabaagca aaafcagatbb btfctgcbtca 
4500 

gacbaabfcbg bacaatgaat ggbtaabaaa ccattgaagc bbttabbaab 
4550 

<210> 92 
<211> 4450 
<212> DNA 

<213> Arabidopsis sp 
<400> 92 

bbbaggbfcac aaaabcaatg atatbgcgba tgtcaactat aaaagccaaa agbaaagccb 
60 

cttgbbbgac cagaaggbca bgatcabtgb abacatacag ccaaactacc bccbggaaga 
120 

aaagacabgg abcccaaaca acaacaatag cbbcbtttac aagaaccagt agtaacbagb 
180 

cacbaabcta aaagagbbaa gbttcagctfc fctctggcaab ggctccbbga bcatbbcaat 
240 

cctgaaggag acccactttg bagcaagacc atgtccbctg bttcacttac agtgtgbcbc 
300 

aaaagfccfcac ttcaabbctb catatatagg ttccbcacac tacagcbtca tcctcafctcg 
360 

btgacagaga gagagtcttb attgaaaact bcbbccaagt acaacbccac baaafcabaab 
420 

agcaccaaac cactbgtbcg acacaaatcb gbacagatat aaaaacacta tbaggttbtc 
480 

caaggcaaab cacabaatbg gabtgbgaaa gagbacaaaa gabaaaccca aabtbbcaba 
540 

cbtbcbacbg cagfccagcac cagabgataa gbcagcfcgbc ccbatbtgcc abccbaacbg 
600 

bcctgabgca gcggccagtg abgcgbaaba tbgccacccb baabcabbag agcgagaaac 
660 

aaaaagaabc aaaagacagt aaabggaabb aggaabcaca aatgagbccb bgbaaagbbb 
720 

atbgagtacc gagabctgca ctgaabccag aaagbgcaag aaaaccbabg gatgcbgbgc 
780 

caaabccagb baaccaaagc bbbgbabbab caccgaabct aagggctgtb gacttaacac 
840 

caacbtbbac abcabcbbcb bbgtccbgga gacacaabab atbagacatb agtccabgga 
900 

aaaaaaatga fcbtaaccbag aababcbcaa aatfcacbtgc abaaaaacbg aacbbgagcb 
960 

gaaabbbbgg gbbcgbagct bgbggcatab acbabbtcat tbbcaabggg ccacaaaggb 
1020 

aacbbbcbbb tctcacbbcb gbbgcaaacg ggaagacbtb babggggcba acbcbbcact 
1080 

baaagbabag aaatcagabg gaaaaggbgg gagabcaggg taabbtbcbt cbfcbabgabb 
1140 

gacaaaagbc gaacabcgaa abggabgcab bbgcabgaga cabgaaacaa aagcbgaaaa 
1200 

agaaafccbgb ggtggbgaag cbagaaaaag aaaacaaagc aagcaababg cacacabbga 
1260 

gabbaactac bbbgcbacbg gbcabaabca aabagatbbb gaagcbaaaa aabaaaaagb 
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1320 

gaababacct 
1380 

tagagaggga 
1440 

gcbccccagb 
1500 

abcabaagaa 
1560 

bbacccaaaa 
1620 

cccctaaaac 
1680 

atcacaaaac 
1740 

gggbgaacca 
1800 

gcaagtaaaa 
1860 

aaagcaactg 
1920 

baggbcbtag 
1980 

taaaacaaaa 
2040 

tggaagaatg 
2100 

bfccaatcgaa 
2160 

gaacgaagtc 
2220 

tgbbgbcgta 
2280 

aacactttcc 
2340 

catactaaag 
2400 

aaacaaactg 
2460 

acctctaaga 
2520 

tccaggatca 
2580 

abcbabbbac 
2640 

ctaaagactt 
2700 

tatataaaat 
2760 

cttaaccact 
2820 

abgagcbcbt 
2880 

agactgcaag 
2940 

cagacataaa 
3000 

accaaccagg 
3060 

acataaacca 
3120 



gabgbgcata 
gtacaataga 
tbabggbcaa 
aabcagaaaa 
bgbaaaccbc 
acggctgcag 
taaagacaag 
babgbgbabg 
aatccaaaca 
cagcccgaga 
ttttgtacga 
caaccataca 
aatccagbba 
aaacata btc 
atcagaaca t 
aabbgabcca 
caccabggbt 
ggatatabaa 
acctbtgfcafc 
agtaatgctc 
gcagccaacg 
atagctctgg 
ccaaacagafc 
caaagaaaac 
ctcccatgct 
gggaagafcca 
aacbacbcca 
fctctfcttafcc 
aaaacacata 
tcctttggga 



aabagtabca 
bggtgcbabg 
accbaaaaag 
babataabgb 
ttcataagtg 
aababacata 
accbgagaac 
tgaabbbbta 
aacctgtaab 
aabccaabcc 
tcaacctgga 
aaatcttgag 
catgaatgct 
cacct bcacc 
gcagabaagc 
acabagaaaa 
acagaaacca 
atbbgacatc 
ctatgbcctg 
cgcaaccaaa 
caabcgacct 
aactagatcc 
tcctgagtaa 
bcaggttbab 
atcaaaaacc 
tbabggattb 
aacbbcbcca 
aagctbcaag 
acbtbabcac 
cgaaaggaaa 



baaacaaggg 
cbtccbbbaa 
gcbbgaggcb 
cbaacbbtga 
ggtaggaaaa 
cbgaaabgag 
atabctbcag 
aacaaacacb 
tgtbaagbbg 
cbtgaaabgg 
bataaaagaa 
cttbacabac 
gbgbabcbac 
atatctaaca 
babbacccaa 
abcaagacca 
tagtbacaca 
actbbabcac 
atcaagcaga 
baaagccaba 
abacaacaat 
abgacgaaac 
gaaacccagt 
agcabbabcc 
bcagctcaag 
gataacbgaa 
cbgatabgba 
agcaagbbag 
abaaaacbaa 
cbababaaac 



bccagcagac 
cbgcagbcca 
gcaabbabaa 
gaagccagaa 
gacaagbaac 
cbcaagbaga 
aabbtgggcc 
bgcaaabacg 
gagaagaabc 
bgbcaaaaga 
abbbgtaaga 
aagcaaccca 
ccbaactact 
ccbgaagtct 
aacagagaba 
gtbccagabg 
aaacabgbbt 
cabaccataa 
bcabtbatag 
tabbtaaaac 
gabggagatt 
atggaacatc 
ggaacbatag 
aabccbgatt 
abcatactac 
aaaagbaaca 
bgbagbcbaa 
bcagaaaaca 
abbbaabgba 
abgcagtcbb 



bccggagaga 
bcctaacaab 
aaacgaabca 
bagabbbaaa 
aaagabgaag 
aaagaabfcbg 
aacbacataa 
cgacbbbagg 
cctaagccba 
ccactggcga 
caacataatc 
bctttgtbta 
aaacacabat 
btcactbbbb 
tgacbggaaa 
tcaaagcaat 
ccbaaaccaa 
gatagcttaa 
tacaaccagc 
bbggaaggct 
cagagbabcg 
gbtataabab 
bacbgbaaca 
bcfcgccaabc 
cbaabtgccb 
gagaaatagc 
caabaabaaa 
bcacagccaa 
abctgacbba 
tcbtbcccbc 
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agctattctfc tcggatggat babaabgaab ctcaaaagtg aaatgtcttg atbcbcagcb 
3180 

acattactca aaggcgaaga taaacttacc acatacaagg ccacgcaagc aaccaagttc 
3240 

caabgggbbt atccaabcga gcaagcttag cabaaccbcb aacbbcbbcb ggbaaabaca 
3300 

aabcbatcca agaagctbcc ttaacaacaa caccabcacb cttctcctta bcabcbbbcb 
3360 

bcggcbbbcc cbccaaaacc gaagaagacg acgacabtcc acaaabbaab cbgbaacbcc 

3420 
aaccaacacc aaaaaacbbc tccbgabgca abbcbcbbcc bbbacbccab acbbggbaab 

3480 

babcabbcca tgaaggataa cacttagtga aaggatbbgb gtaatgggta gtcacaggat 
3540 

bggacaagga tttatgfctgt gattgcaaaa gagcagagga agaagatgga gttacggaga 
3600 

cggaagabbb caacaaccgt cbbgaaacac gggagagccc aaaaaacgcc atcbtbgaga 
3660 

gaaabbgbtg ccbggaagaa acaaagactt gagatttcaa acgbaagtga attcttacga 
3720 

acgaaagcba acbbcbcaag agaabcagab tagtgattcc tcaaaaacaa acaaaactab 
3780 

ctaabttcag ttbcgagtga tgaagccbta agaatcbaga accbccatgg cgbbtctaab 
3840 

cbcbcagaga baatcgaatt ccttaaacaa bcaaagcbta gaaagagaag aacaacaaca 
3900 

acaacaaaaa aaatcagatt aacaaccgac cagagagcaa cgacgacgcc ggcgagaaag 
3960 

agcacgbcgt cbcggagcaa gacttcbtct ccagbaaccc ggabggabcg ttaabgggcc 
4020 

tgtagabbab batabtbggg ccgaaacaab bgggbcagca aaaacttggg ggataatgaa 
4080 

gaaacacgta cagtatgcat ttaggcbcca aabtaattgg ccabataabb cgaatcagab 

4140 

aaactaatca acccctacct tacttabbtc tcactgbtbb tabttctacc ttagtagtbg 

4200 

aagaaacacb btbatttatc ttbtcgggac ccaaatbbga taggatcggg ccabtactca 

4260 

tgagcgbcag acacatatta gcctbatcag abtagtgggg taaggtbttt ttaattcggb 
4320 

aagaagcaac aatcaabgtc ggagaaabta aagaatctgc abgggcgtgg cgbgatgaba 
4380 

tgtgcabatg gagtcagttg ccgatcabab abaactabbt ataaactaca tataaagact 
4440 

actaatagab 
4450 

<210> 93 
<211> 2850 
<212> DNA 

<213> Arabidopsis sp 
<400> 93 

aabtaaaatb bgagcggtcb aaaccatbag accgbtbaga gabccctcca acccaaaata 
60 

gtcgabtbbc acgbcbbgaa cabababbgg gccttaatcb gbgtggtbag baaagactbb 
120 

babbggbcaa agaaaaacaa ccatggccca acabgtbgab actbbbatbt aabfcatacaa 
180 

gbacccctga abtcbcbgaa abatatbbga bbgacccaga babbaabbtb aabtabcabb 
240 
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tcc tgtaaaa gtgaaggagt caccgtgact cgtcgtaatc tgaaaccaat ctgfcfccatat 
gatgaagaag bbtcbctcgt tctcctccaa 

tggcgaagat ctgttgttfc. bcgttbctct fccaagaatct cbgbfctcbfcc ttcgttacca 
aaccctagac tgattccbtg gbcccgcgaa ttabgtgccg ttaatagctb ctcccagccb 
cjggtetcg. cggaabcaac bgctaagtta gggabcactg gtgfcfcagabc tgabgccaat 
cgagtttbtg ccactgcbac tgccgccgct acagctacag ctaccaccgg bgagabbtcg 
tctagagttg cggcbbtggc tggabtaggg cafccactacg cbcgttgtba btgggagcbt 
tctaaagcta aacttaggta tgtgttt.ct tttebtbtcb catgaaaaab ctgaaaatbt 
ccaattgtbg gattcfctaaa ttctcatttg tttt.bggtt gtagtafcgct tg tgg ttgca 
acfctcbggaa ctgg gtatat tctgggtacg ggaaatgctg caattagcfct cccggggctt 
tgfctacacab gbgcaggaac cabgatgatt gcbgcabcbg cbaattcctb gaatcaggtc 
attgaaatgb tgagaagttc abaaabttcg aatcctbgtt gbgttfcabgb agttgatcbt 
gcttgcttat gbttabgtag btgaaaagtt baaaaabbtc taabccttgg tagbbgabcb 
cgcttgtttg bbbbbtcatt ttctagattb btgagataag caabgatbct aagabgaaaa 
gaacgatgct aaggccattg ccttcaggac gt.fct.gtgt tccacacgct gtbgcatggg 
ctactattgc tggtgcttcb ggtgctbgtt tgttggccag caaggtgaat gtttgfctbfct 
ttatatgtga fcfctcbttgtt ttatgaabgg gtgafctgaga gattatggat ctaaactttt 
gcttccacga caaggttatt gcagacbaat atgtbggctg ctggacttgc atctgccaat 
cttgtacttt atgcgttbgt ttatactccg ttgaagcaac btcaccctat caatacatgg 
gttggcgctg ttgbhggtgc tatcccaccc tbgcttgggt aaabtbttgt bccbbtbcth 
cbttatttta gcagattctg ttttgttgga fctgcbbtt aattcaaaat gtagtcatgg 
Ucaccaatb ctatgcttat ctattttgtg tgtfcgtcagg tgggcggcag cgtctggtca 
gatbtcatac aattcgatga ttcbtccagc tgcbcttb.c tttbggcaga tacctcabbt 
tabggcccbfc gcacabctct gccgcaabga bbatgcagcb ggagggtaag accatabggt 
gtcabatgag atbagaabgb cfcccbtccat gtagtgtbga tctbgaacta gttcaabbbc 
gtggaatgab cagagtgbcc bagatagtgt cacagcagtc gacabtbtag tggctagata 
atgagtbcbt tccgtfcagag ataaacattc gcgaacabbg tttccgctt ccgcgaccca 
actbctgatb bbgtbtcbtg qtaccfcfcot-h - 

1920 9 gcaccetgbb bbcagbfcaca agabgbtgtc acbcttbgat 

ccgbcaggga agagaatagc agcag t g gct ct aag g aact gcbfcbbacab gatcccbctc 
ggbbtcafccg cctatgactg tg.gbcttgfc agabtcafccb bttttttgta gbbtattgac 
tgcabtgcbg babcbgattt tbgcbgttcc btccaatbtt bgtgacagao gggbtaaccb 
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2100 

caagbtggtfc bbgccbcgaa tcaacacttc bcacacbagc aatcgctgca acagcabbbb 
cattcfcaccg agaccggacc afcgcafcaaag caaggaaaat gttccatgcc agtcbbcbcb 
tccttcctgfc tttcabgtcfc ggbcttcttc tacaccgbgb cbcbaatgab aatcagcaac 
aactcgbaga agaagccgga bbaacaaabt cbgbabctgg bgaagbcaaa actcagaggc 
gaaagaaacg tgbggcbcaa cctccggbgg cbbabgccbc bgcbgcaccg tbbcctbtcc 
bcccagcbcc bbccbbcbac tcfcccatgat aaccbbbaag caagctabtg aatbbtbgga 
aacagaaahb aaaaaaaaaa tctgaaaagt tctbaagtbb aabcbbbggb taataatgaa 
gtggagaacg cabacaagtt tabgbabbtt tbcbcabcbc cacabaatbg tattbbbbct 
ctaagtabgt bbcaaatgafc acaaaataca tacbbbatca abbatctgat caaabbgabg 
aabbbttgag cttbgacgtg btaggtcbat cbaabaaacg bagbaacgaa tttggbtbbg 
gaaatgaaab ccgataaccg abgabggbgt agagbbaaac gabtaaaccg ggttggtbaa 
aggbcbcgag bcbcgacggc bgcggaaabc ggaaaabcac gabbgaggac btbgagcbgc 
cacgaagabg gcgabgaggb bgaaabcaab 

2850 

<210> 94 
<211> 3660 
<212> DNA 

<213> Arabidopsis sp 
<400> 94 

babttgbabb bbbabbgtba aabbbtabga tbtcacccgg babatabcab cccababtaa 
batbagatbb abbbtbtggg cbbbattbgg gbbtbcgabb baaacbgggo ccattcbgct 
tcaatgaaac ocbaabgggt bbbgtttggg cbbbggatfct aaaccgggcc catbcbgcbb 
caabgaaggb ccbbbgbcca acaaaactaa cabccgacac aactagbabb gccaagagga 
bcgbgccaca bggcagbbab bgaatcaaag gccgccaaaa cbgbaacgba gacabbacbb 
abctccggba acggacaacc acbcgbbbcc cgaaacagca actcacagac tcacaccacb 
ccagbcbccg gcbbaacbac caccagagac gabbcbcbcb bccgtcggbb cbatgacbbc 
gabbcbcaac acbgbctcca ccabccacbc bbccagagbb accbccgbcg abcgagbcgg 
agbccbcbcb cbbcggaatb cggabbccgt tgagtbcacb cgccggcgbt ctggtbbcbc 
gacgbbgabc bacgaabcac ccggbagbta gcabbcbgbb ggabagabbg abgaabgbbb 
bctbcgabbb bbbbtttacb gatcbtgbbg bggatcbcbc gbagggcgga gabbbgbbgb 
gcgbgcggcg gagacbgaba cbgabaaagg tabgabbbbb bagbbgbbbb batbbbcbcb 
cbcbbcaaaa tbcbcbtbbc aaacacbgbg gcgbbbgaat tbccgacggc agbbaaabcb 
cagacaccbg acaaggcacc agccggbggb tcaagcabba accagcbtcb cggbabcaaa 
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840 



Sgagcatctc aagaaactgb aabtbbgbbc atctcctcag aabcbbtbaa attatcatat 
ttgbggabaa bgatgbgtba g ttta ggaat bttccbacba aaggtaabcb cbbtfcgagga 
caag tcttgt ttttagctt. gaaatgatgfc gaaaabgbbg tbtgbtagcb aaaaagagbt 
tgttgttata ttcbgtatbc agaabaaatg gaagatbcgb cbbcagcbta caaaaccagt 
cacbtggccb ccactggbtb ggggagtcgfc cbgbggbgcb gcbgcbbcag gbaafccatac 
gaaccbcbtb bggabcatgc aabacbgtac agaaagbbfcb ttcabfcbtcc ttccaabbgb 
ttcttctggc agggaacttfc catbggaccc cagaggabgt bgctaagbcg attcbttgca 
tgatgatgtc bggtccbtgt cttactggct atacacaggt ctggtttbac acaacaaaaa 
gctgacttgt tcttabtcta gtgcabttgc tbggtgcbac aafcaacctag acttgtcgat 
tbccagacaa bcaacgactg gtatgataga gababcgacg caatbaatga gccabatcgt 
ccaattccat cbggagcaat abcagagcca gaggbaactg agacagaaca tfcgtgagctt 
tbatcbcbbt tgtgabtctg atbbctcctt actccbbaaa atgcaggtba ttacacaagt 
ctgggtgcta ttatbgggag gtcttggtab tgctggaata tbagatgtgt gggtaagttg 
gcccfcbctga catbaactag bacagttaaa gggcacatca gabtbgctaa aatcttcccb 
tabcaggcag ggcataccac tcccactgtc ttctabcbbg ctfcbgggagg atcatfcgcba 
bcbbatatat acbcbgcbcc accbcttaag gtaagtbtta btcctaactt ccactctcta 
gtgabaagac actccabcca agtbfcbggag ttttgaabat cgatabctga actgatctca 
ttgcagctaa aacaaaatgg atg ggttgga aatbbbgcac tbggagcaag ctababtagt 
ttgccatggb aagatabctc gtgtatcaat aatabatggc gtbgtbcbca tctcattgat 
ttgttbcttg ctcacbtgac tgabaggbgg gctggccaag catbgtttgg cactcttacg 
ccagatgttg ttgttctaac actcfctgtac agcatagctg gggtactctt ttggcaaacc 
ttbtatgttg cttttfctcgt batcbgttgt aatatgctcb tgcttcafcgt tgtacctbtg 
tgataabgca gbbaggaaba gccabtgtta acgactbcaa aagbgtbgaa ggagabagag 
catbaggacb tcagbcbcbc ccagtagctt ttggcaccga aactgcaaaa tggatatgcg 
ttggtgctat agacabbacb cagcbbbctg btgccggtat gbacbabcca ctgttttbgb 
gcagctgtgg ctfccbatbbc ttbtccttga tctbatcaac tggafcabbca ccaa tggt aa 
agcacaaatt aabgaagcbg aatcaacaaa ggcaaaacab aaaagbacab tctaatgaaa 
tgagctaafcg aagaggaggc abctacbtbb atgtbbcabb agbgtgabbg atggafctbtc 
abtbcatgcb bctaaaacaa gbabbtbcaa eanhrrf^t-,, u 

2580 aa cagtgtcatg aaataacaga acfcfcatatcfc 

tcabttgbac bttfcacbagb ggabgagbta cacast-^n. «. 

2640 y^y^ta cacaatcatb gbtatagaac caaafccaaag 






WO 02/33060 PCT/US01/42673 

^agagatca tcattagtat .tgtct.ttt t g gtfcgC a gg atatctatta gcatctggga 
;«ctt.tfc. t gc gttggcg ttggttgctt bgafccattcc tca gattgtg ttccaggtaa ■ 
agac gttaac agtctcacat tataattaat caaattcttg fccactcgfcct gattgcfcaca 

ctcgctfccta taaactgcag fcttaaatact bbcfccaagga cc ctgtC aaa bacgacgtca 
.Jt.cc.ggt aa g tcaactb ag t acaca tg tttgtgttct tttgaaatat ctfctgagagg 
tctcttaatc agaagttgct tgaaacactc .tcttg.tt. ca gg caa gcg cgcagccatt 
Jjjwtgctc ggaatatttg fcaacggcatt agcatcgcaa cactgaaaaa ggcgtatttt 
gatggggttt tgtcgaaagc agaggtgttg acacatcaaa 

aactagttta aaa gattt t g taaaa tgtat g taccgttat tactagaaac aacbcctgtt 
gtatcaattt agcaaaacgg ctgagaaatt gtaabtgatg ttaccgtatt tgcgctccat 
"ttgcttt cctgctcata tcgaggattg gggttt.tgt fgttctgtc acttctctgc 
tttcagaatg tttttgtttt ctgtagtgga ttttaactat tbtcatcacb btttgtabtg 
ajtctaaaca tgt atccaca taaaaaca g t aatatacaaa aatgatactb cctcaaactt 
tjtataatct aaatctaaca actagctagb aacccaacta acttcataca attaatttga 
gaaactacaa agacbagact atacatatgt tatttaacaa cttgaaactg tgtt.tt.ct 
acctgatttt tttct.ttct acagccattt gatatgctgc aatcbtaaca tatcaagtct 
cacgttgfctg gacacaacat actatcacaa gtaagacacc, aagtaaaacc aaccg g caac 
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